Blog#7: How Microsoft is Using Machine Learning to Secure its Software Development Cycle
Name: Aileen 蔣慧玲
Student ID: D0726917
Microsoft is reportedly building a machine learning classification system
to secure the software development lifecycle. The system classifies bugs as
security or non-security, and as critical or non-critical, with a level of
accuracy similar to that of security experts.
According to the article, Microsoft has collected 13 million work items and
bugs since 2001 and has spent approximately $150,000 per issue to mitigate
bugs and vulnerabilities. More than 45,000 developers are said to be working
on solutions to these problems. Microsoft stated, “We used that data to
develop a process and machine learning model that correctly distinguishes
between security and non-security bugs 99% of the time, and accurately
identifies the critical, high priority security bugs 97% of the time.”
Behind The Classification System
The developers apply five processes to build a machine learning model that
is as accurate as possible:
- Data collection: identify all data types and sources, and evaluate their
quality.
- Data curation and approval: review the data and confirm that the labels are
correct.
- Modelling and evaluation: select a data modelling technique, train the
model, and evaluate its performance.
- Evaluation of model in production: monitor the average number of bugs and
manually review a random sample of bugs.
- Automated re-training: make sure that the bug modelling system keeps pace
with the ever-evolving products at Microsoft.
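The last two processes can be pictured with a small sketch. The article does not describe how Microsoft implements them, so everything here is an assumption made for illustration: the function names, the bug IDs, and the 0.95 retraining threshold are all hypothetical. The idea is to draw a random sample of classified bugs for manual review, measure how often the model agrees with the reviewers, and flag the model for re-training when agreement drops.

```python
import random


def sample_for_review(bug_ids, k, seed=None):
    """Pick up to k bug IDs at random for manual review (hypothetical helper)."""
    rng = random.Random(seed)
    return rng.sample(list(bug_ids), min(k, len(bug_ids)))


def agreement_rate(predicted, reviewed):
    """Fraction of reviewed bugs where the model label matches the reviewer label.

    predicted: dict mapping bug ID -> label assigned by the model
    reviewed:  dict mapping bug ID -> label assigned by a human reviewer
    """
    if not reviewed:
        raise ValueError("no reviewed bugs to compare against")
    matches = sum(predicted[bug_id] == label for bug_id, label in reviewed.items())
    return matches / len(reviewed)


def needs_retraining(rate, threshold=0.95):
    """Flag the model for re-training; the threshold is an assumed value."""
    return rate < threshold


# Made-up labels for four bugs.
predicted = {1: "security", 2: "non-security", 3: "security", 4: "non-security"}
reviewed = {1: "security", 2: "security", 3: "security", 4: "non-security"}

rate = agreement_rate(predicted, reviewed)
print(rate, needs_retraining(rate))  # 0.75 True
```

In a real pipeline the reviewed labels would come from security experts, and the threshold would be chosen from the accuracy the team is willing to accept.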
How It Works
The machine learning model operates in two steps to classify bugs
accurately:
1. Learn how to distinguish security bugs from non-security bugs.
2. Apply severity labels such as critical, important, and low-impact to the
security bugs.
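The two steps above can be sketched as two chained classifiers. This is only a minimal illustration under stated assumptions: the article does not say which algorithm Microsoft uses, so a tiny naive Bayes classifier over bug titles is swapped in here, and the class name, the `triage` function, and the training titles are all made up.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes over whitespace tokens (illustrative only)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.label_counts = Counter(labels)      # label -> number of examples
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                # Laplace (add-one) smoothing so unseen words do not zero out a label
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: (bug title, is it a security bug, severity or None)
BUGS = [
    ("buffer overflow in image parser allows remote code execution", True, "critical"),
    ("sql injection in login form exposes user table", True, "critical"),
    ("remote code execution through crafted network packet", True, "critical"),
    ("cross site scripting in comment preview", True, "important"),
    ("session token leaked in debug log", True, "important"),
    ("verbose error message reveals server version", True, "low-impact"),
    ("button misaligned on settings page", False, None),
    ("typo in installation guide", False, None),
    ("slow page load on dashboard", False, None),
    ("crash when saving empty file", False, None),
]

# Step 1: learn security vs. non-security from all bugs.
step1 = NaiveBayes().fit(
    [title for title, _, _ in BUGS],
    ["security" if is_sec else "non-security" for _, is_sec, _ in BUGS],
)

# Step 2: learn severity labels from the security bugs only.
security_bugs = [(title, sev) for title, is_sec, sev in BUGS if is_sec]
step2 = NaiveBayes().fit(
    [title for title, _ in security_bugs],
    [sev for _, sev in security_bugs],
)


def triage(title):
    """Run step 1, then step 2 only when the bug looks like a security bug."""
    if step1.predict(title) == "non-security":
        return ("non-security", None)
    return ("security", step2.predict(title))


print(triage("remote code execution in pdf parser"))  # ('security', 'critical')
print(triage("typo on settings page"))                # ('non-security', None)
```

Chaining the models this way means the severity classifier only ever sees security bugs, which mirrors the order of the two steps described above. A production system would need far more data and a stronger model than this toy example.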
Wrapping Up
With the machine learning classification system in place, the developers can
correctly identify which work items are security bugs 99% of the time, and
can label security bugs as critical or non-critical with 97% accuracy.