Direction for Detection: A Survey of Automated Vulnerability Detection and all of its Pain Points
Dan Ristea, Shae McFadden, Ezzeldin Shereen, Madeleine Dwyer, Sanyam Vyas, Chris Hicks, Vasilios Mavroudis

TL;DR
This survey analyzes 87 influential works on automated vulnerability detection, identifies systemic pain points, and offers recommendations to improve research practices and broaden the field's scope.
Contribution
It systematically categorizes existing research, uncovers interconnected flaws, and proposes concrete solutions to address core issues in ML-based vulnerability detection.
Findings
The field focuses mainly on binary classification of C/C++ vulnerabilities at the function level.
The survey identifies feedback loops that reinforce narrow problem formulations and dataset biases.
Recommendations aim to diversify vulnerability types, languages, and detection granularity.
Abstract
Security vulnerabilities in software can have severe consequences; however, manual vulnerability detection is costly and does not scale, especially as agentic coding frameworks increase the rate of code production. Over the last decade, a large body of research has applied machine learning to automate vulnerability detection (ML4AVD), yet self-reported performance on the most popular datasets shows no clear upward trend. The ML4AVD research community has identified several flaws in problem formulations, datasets, and metrics, but these are discussed in isolation, leaving the overarching problems that generate and reinforce these flaws unaddressed. We first systematize the field through a survey of 87 influential works based on their problem formulation, input and detection granularity, target programming languages, evaluation metrics, datasets, and detection approach.…
