Software Repositories and Machine Learning Research in Cyber Security
Mounika Vanamala, Keith Bryant, Alex Caravella

TL;DR
This paper explores how machine learning techniques can be applied to cyber security repositories to improve early vulnerability detection during software development, aiming to enhance automation and security.
Contribution
It proposes applying supervised machine learning methods, such as SVMs and neural networks, to connect software requirements with known vulnerabilities, building on earlier work that relied on topic modeling and unsupervised techniques.
Findings
Successful use of topic modeling and unsupervised ML for vulnerability detection
Proposed transition to supervised ML methods like SVM and neural networks
Potential for improved automation in identifying software vulnerabilities
Abstract
In today's rapidly evolving technological landscape, the rise in cyber security attacks has become a pressing concern, and robust cyber security defenses are now essential across all phases of software development. It is particularly important to identify critical cyber security vulnerabilities at the initial stages of the software development life cycle, notably during the requirements phase. Using cyber security repositories such as the Common Attack Pattern Enumeration and Classification (CAPEC) from MITRE and the Common Vulnerabilities and Exposures (CVE) database, attempts have been made to leverage topic modeling and machine learning to detect these early-stage vulnerabilities in the software requirements process. Past research themes have returned successful outcomes in attempting to automate…
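To make the idea of linking requirement text to repository entries concrete, here is a minimal sketch that ranks attack-pattern descriptions by TF-IDF cosine similarity to a requirement. It is a simple unsupervised baseline, not the topic modeling or supervised methods the paper discusses. The `ATTACK_PATTERNS` entries are hypothetical, paraphrased placeholders rather than actual CAPEC records, and the function names are illustrative assumptions.

```python
import math
import re
from collections import Counter

# Hypothetical, paraphrased attack-pattern summaries for illustration only.
# These are NOT actual CAPEC entries.
ATTACK_PATTERNS = {
    "sql-injection": "attacker injects crafted sql statements through user "
                     "input to read or modify database records",
    "cross-site-scripting": "attacker injects malicious script into web pages "
                            "viewed by other users in the browser",
    "brute-force-login": "attacker repeatedly guesses password values to gain "
                         "access to a user account login",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_idf(docs):
    """Compute smoothed inverse document frequency over tokenized documents."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(tokens, idf):
    """Build a sparse TF-IDF vector, ignoring tokens outside the vocabulary."""
    counts = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_patterns(requirement, patterns=ATTACK_PATTERNS):
    """Return (pattern_name, similarity) pairs, most similar first."""
    names = list(patterns)
    docs = [tokenize(patterns[name]) for name in names]
    idf = build_idf(docs)
    query = tfidf_vector(tokenize(requirement), idf)
    scored = [(name, cosine(query, tfidf_vector(doc, idf)))
              for name, doc in zip(names, docs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    requirement = ("The system shall store user records in a SQL database "
                   "and accept search input.")
    for name, score in rank_patterns(requirement):
        print(f"{name}: {score:.3f}")
```

A supervised variant would replace the similarity ranking with a classifier trained on requirements labeled with their associated attack patterns, which is the direction the summary above attributes to the paper.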
Taxonomy
Topics: Software Engineering Research · Information and Cyber Security · Software Reliability and Analysis Research
Methods: Linear Discriminant Analysis
