MultiVer: Zero-Shot Multi-Agent Vulnerability Detection
Shreshth Rajan

TL;DR
MultiVer is a zero-shot multi-agent system for vulnerability detection. Its recall on PyVul exceeds that of fine-tuned GPT-3.5, showing that an ensemble can be effective without additional training.
Contribution
The paper presents MultiVer, a zero-shot multi-agent ensemble for vulnerability detection. Its recall surpasses fine-tuned GPT-3.5 on PyVul, and its detection rate matches specialized systems on SecurityEval.
Findings
MultiVer achieves 82.7% recall on PyVul, exceeding fine-tuned GPT-3.5 (81.3%).
On SecurityEval, MultiVer attains a 91.7% detection rate, matching specialized systems.
The multi-agent ensemble adds 17 percentage points of recall over single-agent security analysis.
Abstract
We present MultiVer, a zero-shot multi-agent system for vulnerability detection that achieves state-of-the-art recall without fine-tuning. A four-agent ensemble (security, correctness, performance, style) with union voting achieves 82.7% recall on PyVul, exceeding fine-tuned GPT-3.5 (81.3%) by 1.4 percentage points, making it the first zero-shot system to surpass fine-tuned performance on this benchmark. On SecurityEval, the same architecture achieves a 91.7% detection rate, matching specialized systems. The recall improvement comes at a precision cost: 48.8% precision versus 63.9% for fine-tuned baselines, yielding 61.4% F1. Ablation experiments isolate component contributions: the multi-agent ensemble adds 17 percentage points of recall over single-agent security analysis. These results demonstrate that for security applications where false negatives are costlier than false positives, zero-shot…
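The union voting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The four agent functions are hypothetical keyword-matching stand-ins for the paper's LLM-based agents, and the sample snippets are invented for illustration. The only assumption about the voting rule is that a sample is flagged when at least one agent flags it, which is why the ensemble can raise recall while lowering precision. The sketch also checks that the reported precision and recall are consistent with the reported F1.

```python
from typing import Callable, Dict, List

# Hypothetical agent type: takes source code, returns True if it flags the code.
Agent = Callable[[str], bool]


def union_vote(code: str, agents: Dict[str, Agent]) -> bool:
    """Flag the sample if at least one agent flags it (logical OR)."""
    return any(agent(code) for agent in agents.values())


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy stand-ins for the four agents named in the abstract (assumptions,
# not the paper's prompts or models).
def security_agent(code: str) -> bool:
    return "eval(" in code


def correctness_agent(code: str) -> bool:
    return "except:" in code


def performance_agent(code: str) -> bool:
    return False


def style_agent(code: str) -> bool:
    return False


agents: Dict[str, Agent] = {
    "security": security_agent,
    "correctness": correctness_agent,
    "performance": performance_agent,
    "style": style_agent,
}

# Invented samples: the second one is missed by the security agent alone
# but caught by the union of all four agents.
samples: List[str] = [
    "result = eval(user_input)",
    "try:\n    run()\nexcept:\n    pass",
    "print('hello')",
]

flags = [union_vote(sample, agents) for sample in samples]
print(flags)

# Consistency check on the reported PyVul numbers (precision 48.8%, recall 82.7%).
reported_f1 = f1_score(0.488, 0.827)
print(round(reported_f1 * 100, 1))  # 61.4
```

Under union voting, every additional agent can only add flagged samples, never remove them. Recall therefore cannot decrease as agents are added, while precision depends on how many of the extra flags are false positives.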
Taxonomy
Topics: Advanced Malware Detection Techniques · Spam and Phishing Detection · Adversarial Robustness in Machine Learning
