Revisiting Vulnerability Patch Identification on Data in the Wild
Ivana Clairine Irsan, Ratnadira Widyasari, Ting Zhang, Huihui Huang, Ferdian Thung, Yikun Li, Lwin Khin Shar, Eng Lieh Ouh, Hong Jin Kang, David Lo

TL;DR
This paper evaluates the effectiveness of security patch detection models trained on NVD data when applied in real-world scenarios, revealing significant performance drops and proposing a combined dataset approach for improvement.
Contribution
The paper demonstrates the limitations of NVD-based training data for in-the-wild patch detection and suggests a hybrid dataset method to enhance model robustness.
Findings
Models trained on NVD data perform poorly on in-the-wild patches.
NVD-linked patches differ significantly from in-the-wild patches in message and content.
Combining NVD data with manually identified patches improves detection robustness.
Abstract
Attacks can exploit zero-day or one-day vulnerabilities that are not publicly disclosed. To detect these vulnerabilities, security researchers monitor development activities in open-source repositories to identify unreported security patches. The sheer volume of commits makes this task infeasible to accomplish manually. Consequently, security patch detectors are commonly trained and evaluated on security patches linked from vulnerability reports in the National Vulnerability Database (NVD). In this study, we assess the effectiveness of these detectors when applied in-the-wild. Our results show that models trained on NVD-derived data suffer substantially decreased performance, with decreases in F1-score of up to 90% when tested on in-the-wild security patches, rendering them impractical for real-world use. An analysis comparing security patches identified in-the-wild and commits linked from…
Taxonomy
Topics: Information and Cyber Security · Web Application Security Vulnerabilities · Cybercrime and Law Enforcement Studies
