Diverse LLMs vs. Vulnerabilities: Who Detects and Fixes Them Better?
Arastoo Zibaeirad, Marco Vieira

TL;DR
This paper introduces DVDR-LLM, an ensemble framework that combines diverse Large Language Models to improve software vulnerability detection and repair, showing significant accuracy gains but also trade-offs in false positive and negative rates.
Contribution
The paper proposes DVDR-LLM, a novel ensemble approach that leverages multiple LLMs to enhance vulnerability detection and repair performance, especially on complex code.
Findings
10-12% higher detection accuracy than the average of the individual models
18% improvement in recall for multi-file vulnerabilities
Increased false negatives in detection tasks due to ensemble thresholds
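The threshold trade-off noted in the findings can be illustrated with a minimal sketch. It assumes a simple vote-count rule, where a sample is flagged once at least `min_votes` models call it vulnerable. The summary does not state which aggregation rule DVDR-LLM actually uses, so the function names, the `min_votes` parameter, and the toy votes below are all hypothetical and not taken from the paper.

```python
from typing import Iterable, Sequence, Tuple


def ensemble_vote(votes: Sequence[bool], min_votes: int) -> bool:
    """Flag a sample as vulnerable when at least `min_votes` models say so.

    Assumed aggregation rule for illustration only.
    """
    if not votes:
        raise ValueError("at least one model vote is required")
    return sum(votes) >= min_votes


def error_counts(
    samples: Iterable[Tuple[Sequence[bool], bool]], min_votes: int
) -> Tuple[int, int]:
    """Return (false_positives, false_negatives) for a given vote threshold."""
    false_pos = 0
    false_neg = 0
    for votes, is_vulnerable in samples:
        predicted = ensemble_vote(votes, min_votes)
        if predicted and not is_vulnerable:
            false_pos += 1
        elif is_vulnerable and not predicted:
            false_neg += 1
    return false_pos, false_neg


# Made-up votes from three hypothetical models, paired with made-up ground truth.
SAMPLES = [
    ((True, True, False), True),
    ((True, False, False), True),
    ((True, False, False), False),
    ((False, False, False), False),
]

if __name__ == "__main__":
    for k in (1, 2, 3):
        fp, fn = error_counts(SAMPLES, k)
        print(f"min_votes={k}: false positives={fp}, false negatives={fn}")
```

On this toy data, raising `min_votes` removes the false positive but introduces false negatives, which is the direction of the trade-off the findings describe. The numbers themselves carry no meaning beyond the illustration.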
Abstract
Large Language Models (LLMs) are increasingly being studied for Software Vulnerability Detection (SVD) and Repair (SVR). Individual LLMs have demonstrated code understanding abilities, but they frequently struggle when identifying complex vulnerabilities and generating fixes. This study presents DVDR-LLM, an ensemble framework that combines outputs from diverse LLMs to determine whether aggregating multiple models reduces error rates. Our evaluation reveals that DVDR-LLM achieves 10-12% higher detection accuracy compared to the average performance of individual models, with benefits increasing as code complexity grows. For multi-file vulnerabilities, the ensemble approach demonstrates significant improvements in recall (+18%) and F1 score (+11.8%) over individual models. However, the approach raises measurable trade-offs: reducing false positives in verification tasks while…
Taxonomy
Topics: Information and Cyber Security · Software Engineering Research · Web Application Security Vulnerabilities
