How Microsoft is Using Machine Learning to Secure its Software Development Cycle

June 01, 2020

Blog#7: How Microsoft is Using Machine Learning to Secure its Software Development Cycle

Name: Aileen 蔣慧玲

Student ID: D0726917

Source: https://analyticsindiamag.com/how-microsoft-is-using-machine-learning-to-secure-its-software-development-cycle/

Microsoft is reportedly building a machine learning classification system to secure its software development lifecycle. The system classifies bugs as security or non-security and as critical or non-critical, with a level of accuracy similar to that of security experts.

According to the article, Microsoft has collected 13 million work items and bugs since 2001 and has spent approximately $150,000 per issue to mitigate bugs and vulnerabilities. It is also said that more than 45,000 developers are working to find solutions for the problem. Microsoft stated, “We used that data to develop a process and machine learning model that correctly distinguishes between security and non-security bugs 99% of the time, and accurately identifies the critical, high priority security bugs 97% of the time.”

Behind The Classification System

The developers follow five steps to build a machine learning model with the highest possible accuracy:

Data collection: identify all data types and sources, and evaluate their quality.
Data curation and approval: review the data and confirm whether the labels are correct.
Modelling and evaluation: select a data modelling technique, train the model, and evaluate its performance.
Evaluation of the model in production: monitor the average number of bugs and manually review a random sample of bugs.
Automated re-training: make sure that the bug modelling system keeps pace with the ever-evolving products at Microsoft.
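The five steps above can be sketched in a few lines of Python. This is only a minimal illustration, not Microsoft's actual system: the word-count "model", the field names (`title`, `label`, `approved`), the sample bugs, and the 0.9 retraining threshold are all assumptions made up for this example.

```python
import random
from collections import Counter

LABELS = ("security", "non-security")

def curate(items):
    """Step 2: keep only items whose label was confirmed by a reviewer."""
    return [it for it in items if it.get("label") in LABELS and it.get("approved")]

def train(items):
    """Step 3 (training): count how often each title word appears per label."""
    counts = {label: Counter() for label in LABELS}
    for it in items:
        counts[it["label"]].update(it["title"].lower().split())
    return counts

def predict(model, title):
    """Score a title by summed word counts; ties fall back to non-security."""
    words = title.lower().split()
    sec = sum(model["security"][w] for w in words)
    non = sum(model["non-security"][w] for w in words)
    return "security" if sec > non else "non-security"

def accuracy(model, items):
    """Step 3 (evaluation): fraction of items the model labels correctly."""
    if not items:
        return 0.0
    correct = sum(predict(model, it["title"]) == it["label"] for it in items)
    return correct / len(items)

def sample_for_review(items, k, seed=0):
    """Step 4: draw a random sample of bugs for manual review."""
    rng = random.Random(seed)
    return rng.sample(items, min(k, len(items)))

def needs_retraining(model, reviewed, threshold=0.9):
    """Step 5: trigger re-training when accuracy on reviewed bugs drops."""
    return accuracy(model, reviewed) < threshold

# Step 1: collected work items (hypothetical sample data).
bugs = [
    {"title": "buffer overflow in parser", "label": "security", "approved": True},
    {"title": "sql injection in login form", "label": "security", "approved": True},
    {"title": "button misaligned on settings page", "label": "non-security", "approved": True},
    {"title": "typo in settings page text", "label": "non-security", "approved": True},
    {"title": "crash on startup", "label": None, "approved": False},
]

curated = curate(bugs)
model = train(curated)
print(predict(model, "overflow in login parser"))
print(accuracy(model, curated))
```

A real system would evaluate on bugs the model has not seen during training; here the same small set is reused only to keep the sketch short.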

How It Works

The machine learning model works in two steps to classify bugs accurately:

1. Learn how to classify security and non-security bugs.

2. Apply severity labels such as critical, important, and low-impact to the security bugs.
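The two steps can be pictured as a small two-stage pipeline, where the severity label is only assigned to bugs that the first stage marks as security bugs. The sketch below is an assumption-heavy illustration: simple keyword sets stand in for the trained models described in the article, and the term lists and function names are hypothetical.

```python
# Hypothetical keyword lists; the real system learns these patterns from data.
SECURITY_TERMS = {"overflow", "injection", "xss", "privilege", "leak"}
CRITICAL_TERMS = {"remote", "execution", "privilege"}
IMPORTANT_TERMS = {"injection", "leak"}

def is_security_bug(title):
    """Stage 1: decide whether a bug is a security bug."""
    words = set(title.lower().split())
    return bool(words & SECURITY_TERMS)

def severity(title):
    """Stage 2: label a security bug as critical, important, or low-impact."""
    words = set(title.lower().split())
    if words & CRITICAL_TERMS:
        return "critical"
    if words & IMPORTANT_TERMS:
        return "important"
    return "low-impact"

def classify(title):
    """Run stage 2 only when stage 1 says the bug is a security bug."""
    if not is_security_bug(title):
        return ("non-security", None)
    return ("security", severity(title))

for title in [
    "Remote code execution via buffer overflow",
    "SQL injection in search box",
    "Button misaligned on settings page",
]:
    print(title, "->", classify(title))
```

Keeping the two stages separate means non-security bugs never receive a severity label, which matches the order of the two steps above.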

Wrapping Up

After applying the machine learning classification system, the developers can correctly identify which work items are security bugs 99% of the time, and they achieve a 97% accuracy rate when labeling critical and non-critical security bugs.


BIBA students at FCU blog about big data