An Empirical Evaluation of LLM-Based Approaches for Code Vulnerability Detection: RAG, SFT, and Dual-Agent Systems

Md Hasan Saju; Maher Muhtadi; Akramul Azim

arXiv:2601.00254·cs.SE·January 5, 2026

TL;DR

This study compares LLM-based methods for code vulnerability detection. It shows that retrieval-augmented generation with external knowledge achieves the highest accuracy and that a dual-agent system improves reasoning transparency.

Contribution

It provides a comprehensive empirical evaluation of RAG, SFT, and dual-agent approaches for vulnerability detection, highlighting the benefits of external knowledge and multi-agent architectures.

Findings

01. RAG achieved the highest accuracy (0.86) and F1 score (0.85) among the evaluated approaches.

02. SFT with QLoRA adapters showed strong performance.

03. The dual-agent system improved reasoning transparency and error mitigation.
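
For readers less familiar with the two metrics quoted in the findings, the following is a minimal sketch of how accuracy and F1 are conventionally computed for a binary vulnerable/not-vulnerable classifier. The function names and the confusion-matrix counts are illustrative assumptions, not values or code from the paper.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all predictions that were correct."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, written in its count form."""
    denom = 2 * tp + fp + fn
    return (2 * tp) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the formulas.
    tp, fp, tn, fn = 40, 10, 45, 5
    print(f"accuracy = {accuracy(tp, fp, tn, fn):.2f}")
    print(f"F1       = {f1_score(tp, fp, fn):.2f}")
```

Accuracy counts true negatives as successes while F1 ignores them, so the two figures can diverge when vulnerable and non-vulnerable samples are unevenly represented.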

Abstract

The rapid advancement of Large Language Models (LLMs) presents new opportunities for automated software vulnerability detection, a crucial task in securing modern codebases. This paper presents a comparative study of the effectiveness of LLM-based techniques for detecting software vulnerabilities. The study evaluates three approaches, Retrieval-Augmented Generation (RAG), Supervised Fine-Tuning (SFT), and a Dual-Agent LLM framework, against a baseline LLM. A curated dataset was compiled from Big-Vul and real-world code repositories from GitHub, focusing on five critical Common Weakness Enumeration (CWE) categories: CWE-119, CWE-399, CWE-264, CWE-20, and CWE-200. Our RAG approach, which integrated external domain knowledge from the internet and the MITRE CWE database, achieved the highest overall accuracy (0.86) and F1 score (0.85), highlighting the value of contextual…
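
The abstract describes the RAG variant only at a high level, so the following is a toy sketch of the general idea rather than the authors' pipeline: look up short notes on the five CWE categories named above, pick the ones that share the most words with the code under review, and prepend them to the prompt. The keyword-overlap retriever, the prompt wording, and the names `CWE_NOTES`, `retrieve`, and `build_prompt` are all assumptions made for illustration. The one-line CWE summaries are paraphrased from the MITRE entry titles.

```python
import re

# Short paraphrased summaries of the five CWE categories named in the abstract.
CWE_NOTES = {
    "CWE-119": "improper restriction of operations within the bounds of a memory buffer",
    "CWE-399": "resource management errors",
    "CWE-264": "permissions, privileges, and access controls",
    "CWE-20": "improper input validation",
    "CWE-200": "exposure of sensitive information to an unauthorized actor",
}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z_]+", text.lower()))


def retrieve(code: str, k: int = 2) -> list[tuple[str, str]]:
    """Rank CWE notes by word overlap with the code (a stand-in for a real retriever)."""
    code_tokens = tokenize(code)
    ranked = sorted(
        CWE_NOTES.items(),
        key=lambda item: len(code_tokens & tokenize(item[1])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(code: str, k: int = 2) -> str:
    """Assemble a prompt that places the retrieved notes before the code."""
    context = "\n".join(f"- {cwe}: {note}" for cwe, note in retrieve(code, k))
    return (
        "Relevant weakness categories:\n"
        f"{context}\n\n"
        "Is the following code vulnerable? Answer yes or no and explain.\n"
        f"{code}"
    )


if __name__ == "__main__":
    print(build_prompt("memcpy(buffer, input, len);"))
```

A real system would more likely use embedding-based similarity over a much larger knowledge base. The word-overlap scoring here only keeps the example self-contained.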

Taxonomy

Topics: Software Engineering Research · Web Application Security Vulnerabilities · Information and Cyber Security