LLM-based Vulnerability Detection at Project Scale: An Empirical Study

Fengjie Li; Jiajun Jiang; Dongchi Chen; Yingfei Xiong

arXiv:2601.19239·cs.SE·January 28, 2026


Open Access

TL;DR

This empirical study compares LLM-based vulnerability detectors with traditional static analyzers at the project scale, revealing their strengths, limitations, and practical challenges in real-world software security assessment.

Contribution

The study provides the first comprehensive evaluation of specialized LLM-based vulnerability detectors against traditional static analyzers at project scale, covering both their detection capability and their practical limitations.

Findings

01

LLM-based detectors uncover more unique vulnerabilities than traditional tools, despite low recall on the benchmark.

02

Both LLM-based and traditional tools produce many false positives, limiting practical usability.

03

High computational costs and failure modes hinder current LLM-based approaches.

Abstract

In this paper, we present the first comprehensive empirical study of specialized LLM-based detectors and compare them with traditional static analyzers at the project scale. Specifically, our study evaluates five latest and representative LLM-based methods and two traditional tools using: 1) an in-house benchmark of 222 known real-world vulnerabilities (C/C++ and Java) to assess detection capability, and 2) 24 active open-source projects, where we manually inspected 385 warnings to assess their practical usability and underlying root causes of failures. Our evaluation yields three key findings: First, while LLM-based detectors exhibit low recall on the in-house benchmark, they still uncover more unique vulnerabilities than traditional tools. Second, in open-source projects, both LLM-based and traditional tools generate substantial warnings but suffer from very high false discovery…
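The abstract reports its results in terms of recall and false discovery rate. As a reminder of how these two metrics are conventionally defined, here is a minimal Python sketch. The function names and the counts in the usage lines are hypothetical and chosen only for illustration; they are not figures from the paper, and the paper's own evaluation scripts may compute these metrics differently.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of known vulnerabilities that a detector reports."""
    known = true_positives + false_negatives
    if known == 0:
        raise ValueError("recall is undefined when there are no known vulnerabilities")
    return true_positives / known


def false_discovery_rate(true_positives: int, false_positives: int) -> float:
    """Fraction of reported warnings that turn out not to be real vulnerabilities."""
    reported = true_positives + false_positives
    if reported == 0:
        raise ValueError("false discovery rate is undefined when nothing is reported")
    return false_positives / reported


# Hypothetical counts, NOT taken from the paper:
# a detector finds 3 of 10 known vulnerabilities,
# and 8 of its 10 inspected warnings are false alarms.
example_recall = recall(true_positives=3, false_negatives=7)
example_fdr = false_discovery_rate(true_positives=2, false_positives=8)
```

Under these standard definitions the two metrics are independent: a detector can have low recall on a benchmark of known vulnerabilities and, separately, a high false discovery rate among the warnings it raises on other projects.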


Taxonomy

Topics: Information and Cyber Security · Software Engineering Research · Web Application Security Vulnerabilities