A Systematic Literature Review on Detecting Software Vulnerabilities with Large Language Models
Sabrina Kaniewski, Fabian Schmidt, Markus Enzweiler, Michael Menth, and Tobias Heer

TL;DR
This systematic literature review analyzes 263 studies on LLM-based software vulnerability detection published between January 2020 and November 2025, categorizing their methodologies, datasets, and limitations to give a structured overview of the field and guide future research.
Contribution
The paper offers the first detailed taxonomy and analysis of LLM-based vulnerability detection studies, addressing fragmentation and promoting reproducibility.
Findings
Identified key limitations in current approaches
Developed a fine-grained taxonomy of methods
Provided actionable future research directions
Abstract
The increasing adoption of Large Language Models (LLMs) in software engineering has sparked interest in their use for software vulnerability detection. However, the rapid development of this field has resulted in a fragmented research landscape, with diverse studies that are difficult to compare due to differences in, e.g., system designs and dataset usage. This fragmentation makes it difficult to obtain a clear overview of the state-of-the-art or compare and categorize studies meaningfully. In this work, we present a comprehensive systematic literature review (SLR) of LLM-based software vulnerability detection. We analyze 263 studies published between January 2020 and November 2025, categorizing them by task formulation, input representation, system architecture, and techniques. Further, we analyze the datasets used, including their characteristics, vulnerability coverage, and…
