Using Multiple Code Representations to Prioritize Static Analysis Warnings
Thanh Trong Vu, Hieu Dinh Vo

TL;DR
This paper presents VulRG, a deep learning-based approach that effectively ranks static analysis warnings by their likelihood of being true positives, significantly reducing developer effort in vulnerability detection.
Contribution
VulRG combines CNN and BiGRU models to predict and prioritize static analysis warnings, improving accuracy over existing methods.
Findings
Recall at Top-50% is 90.9%
At Top-5%, VulRG improves Precision and Recall by 30% over existing methods
Effective on a dataset of 6,620 warnings
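The Top-k% figures above can be read as follows: warnings are sorted by predicted likelihood of being true positives, and Precision and Recall are computed over only the top k% of that ranking. Below is a minimal sketch of that calculation, not the authors' evaluation code. The function names, the toy scores and labels, and the choice to round the cutoff up with `math.ceil` are all assumptions made for illustration; the paper may define the cutoff differently.

```python
import math


def rank_warnings(scores):
    """Return warning indices ordered by predicted score, highest first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def precision_recall_at_top(scores, labels, top_fraction):
    """Precision and Recall when only the top `top_fraction` of warnings is inspected.

    scores: predicted likelihood that each warning is a true positive
    labels: 1 if the warning is a true positive, 0 if it is a false positive
    top_fraction: share of the ranked list to inspect, e.g. 0.5 for Top-50%
    """
    ranked = rank_warnings(scores)
    # Assumption: round the cutoff up and always inspect at least one warning.
    k = max(1, math.ceil(top_fraction * len(ranked)))
    hits = sum(labels[i] for i in ranked[:k])
    total_true = sum(labels)
    precision = hits / k
    recall = hits / total_true if total_true else 0.0
    return precision, recall


# Hypothetical toy data: six warnings, three of which are true positives.
scores = [0.9, 0.2, 0.75, 0.4, 0.1, 0.6]
labels = [1, 0, 1, 0, 1, 0]

precision, recall = precision_recall_at_top(scores, labels, 0.5)
print(f"Top-50%: precision={precision:.2f}, recall={recall:.2f}")
```

In this toy ranking, inspecting the top half of the list means reviewing three warnings, two of which are true positives, so both metrics come out to two thirds. A Recall at Top-50% of 90.9% would correspond to nearly all true positives landing in the upper half of the ranking.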
Abstract
To ensure software quality and prevent attacks on critical systems, static analysis tools are frequently used to detect vulnerabilities early in development. However, these tools often report a large number of warnings with a high false-positive rate, which makes them difficult for developers to act on. In this paper, we introduce VulRG, a novel approach to address this problem. Specifically, VulRG predicts and ranks warnings by their likelihood of being true positives. To predict that likelihood, VulRG combines two deep learning models, CNN and BiGRU, to capture the context of each warning in terms of program syntax, control flow, and program dependence. Our experimental results on a real-world dataset of 6,620 warnings show that VulRG's Recall at Top-50% is 90.9%. This means that using VulRG, 90% of the vulnerabilities can be found by…
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Information and Cyber Security
