It Only Gets Worse: Revisiting DL-Based Vulnerability Detectors from a Practical Perspective

Yunqian Wang; Xiaohong Li; Yao Zhang; Yuekang Li; Zhiping Zhou; Ruitao Feng

arXiv:2507.09529·cs.SE·July 15, 2025

Open Access

TL;DR

This paper critically evaluates deep learning-based vulnerability detectors, revealing their limitations in real-world scenarios, and introduces VulTegra, a framework that uncovers key factors influencing detection performance and suggests improvements.

Contribution

The paper presents VulTegra, an evaluation framework that compares scratch-trained and pre-trained DL models for vulnerability detection along multiple dimensions and identifies the factors that affect their effectiveness.

Findings

01

State-of-the-art detectors have low consistency and limited real-world effectiveness.

02

Pre-trained models are not always superior to scratch-trained models but have specific strengths.

03

Adjusting key factors improves recall and F1 scores across multiple detectors.
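
The findings above are stated in terms of recall and F1, which this page does not define. As a reminder, the sketch below computes both from binary labels using the standard definitions, where 1 means "vulnerable" and 0 means "not vulnerable". The names `Confusion`, `confusion_from_labels`, `precision`, `recall`, and `f1` are hypothetical and chosen only for illustration; this is not VulTegra's code, and the paper may aggregate scores differently (for example, per CWE).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Confusion:
    """Counts for a binary detector (hypothetical helper, not from the paper)."""
    tp: int  # vulnerable samples flagged as vulnerable
    fp: int  # benign samples flagged as vulnerable
    fn: int  # vulnerable samples that were missed


def confusion_from_labels(y_true: Sequence[int], y_pred: Sequence[int]) -> Confusion:
    """Tally TP/FP/FN from ground-truth and predicted labels (1 = vulnerable)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return Confusion(tp=tp, fp=fp, fn=fn)


def precision(c: Confusion) -> float:
    """TP / (TP + FP); defined as 0.0 when nothing was flagged."""
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: Confusion) -> float:
    """TP / (TP + FN); defined as 0.0 when there are no vulnerable samples."""
    actual = c.tp + c.fn
    return c.tp / actual if actual else 0.0


def f1(c: Confusion) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    # Toy labels, invented for illustration only.
    counts = confusion_from_labels([1, 1, 1, 0, 0], [1, 0, 1, 1, 0])
    print(counts, recall(counts), f1(counts))
```

A detector that misses many vulnerable samples has low recall even when its precision is high, which is why an improvement in recall tends to matter for practical use.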

Abstract

With the growing threat of software vulnerabilities, deep learning (DL)-based detectors have gained popularity for vulnerability detection. However, doubts remain regarding their consistency within declared CWE ranges, real-world effectiveness, and applicability across scenarios. These issues may lead to unreliable detection, high false positives/negatives, and poor adaptability to emerging vulnerabilities. A comprehensive analysis is needed to uncover critical factors affecting detection and guide improvements in model design and deployment. In this paper, we present VulTegra, a novel evaluation framework that conducts a multidimensional comparison of scratch-trained and pre-trained-based DL models for vulnerability detection. VulTegra reveals that state-of-the-art (SOTA) detectors still suffer from low consistency, limited real-world capabilities, and scalability challenges. Contrary…

Taxonomy

Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques · Anomaly Detection Techniques and Applications