Detecting Security Patches via Behavioral Data in Code Repositories
Nitzan Farhi, Noam Koenigstein, Yuval Shavitt

TL;DR
This paper introduces a language-oblivious system that detects security patches in Git repositories based solely on developer behavior, reaching 88.3% accuracy and an 89.8% F1 score without analyzing code or commit messages.
Contribution
The novel approach automatically identifies security patches using behavioral data, without relying on code analysis or commit message content.
Findings
Achieved 88.3% accuracy in detecting security patches.
F1 Score of 89.8% demonstrates high reliability.
First language-oblivious solution for this problem.
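For readers who want to relate the two reported metrics, the following is a minimal sketch of how accuracy and F1 are conventionally computed from confusion-matrix counts. The function names and the example counts are illustrative assumptions and are not taken from the paper.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all commits that were classified correctly."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, NOT the paper's data: "positive" means security patch.
example_accuracy = accuracy(tp=40, fp=10, tn=45, fn=5)
example_f1 = f1_score(tp=40, fp=10, fn=5)
```

Because F1 ignores true negatives, it can be higher or lower than accuracy on the same predictions, so the two figures are reported side by side.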
Abstract
The vast majority of software today is developed collaboratively using version control tools such as Git. It is common practice that, once a vulnerability is detected and fixed, the developers behind the software issue a Common Vulnerabilities and Exposures (CVE) record to alert the user community to the security hazard and urge them to integrate the security patch. However, some companies might not disclose their vulnerabilities and instead just update their repository. As a result, users are unaware of the vulnerability and may remain exposed. In this paper, we present a system that automatically identifies security patches using only developer behavior in the Git repository, without analyzing the code itself or the remarks that accompany the fix (the commit message). We show that we can reveal concealed security patches with an accuracy of 88.3% and an F1 score of 89.8%. This is…
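The abstract does not list the behavioral features the system uses. As a rough illustration of what "behavioral data" could mean, the sketch below derives a few activity statistics from commit metadata alone (timestamps and author names), never touching code or commit messages. The `CommitEvent` type, the `window_features` function, and the chosen statistics are all hypothetical and should not be read as the authors' feature set.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass(frozen=True)
class CommitEvent:
    """Commit metadata only: no diff and no commit message (hypothetical type)."""
    timestamp: datetime
    author: str


def window_features(events: List[CommitEvent],
                    center: datetime,
                    window: timedelta) -> Dict[str, float]:
    """Summarize repository activity within +/- `window` of `center`."""
    start, end = center - window, center + window
    in_window = sorted(
        (e for e in events if start <= e.timestamp <= end),
        key=lambda e: e.timestamp,
    )
    # Gaps between consecutive commits, in seconds.
    gaps = [
        (later.timestamp - earlier.timestamp).total_seconds()
        for earlier, later in zip(in_window, in_window[1:])
    ]
    return {
        "commit_count": float(len(in_window)),
        "distinct_authors": float(len({e.author for e in in_window})),
        "mean_gap_seconds": sum(gaps) / len(gaps) if gaps else 0.0,
    }


# Made-up activity log for illustration.
t0 = datetime(2023, 1, 1, 12, 0, 0)
sample_events = [
    CommitEvent(t0, "alice"),
    CommitEvent(t0 + timedelta(hours=1), "bob"),
    CommitEvent(t0 + timedelta(hours=2), "alice"),
    CommitEvent(t0 + timedelta(days=10), "carol"),
]
sample_features = window_features(
    sample_events, center=t0 + timedelta(hours=1), window=timedelta(days=1)
)
```

Feature vectors of this kind could then be fed to any standard classifier; because they depend only on metadata, the same extraction applies regardless of the programming language used in the repository.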
Taxonomy
Topics: Software Engineering Research · Information and Cyber Security · Advanced Malware Detection Techniques
