MulVul: Retrieval-augmented Multi-Agent Code Vulnerability Detection via Cross-Model Prompt Evolution
Zihan Wu, Jie Xu, Yun Peng, Chun Yong Chong, Xiaohua Jia

TL;DR
MulVul is a retrieval-augmented multi-agent framework that improves vulnerability detection across diverse patterns by combining coarse-to-fine classification with cross-model prompt evolution for automated prompt optimization.
Contribution
This paper introduces MulVul, a novel multi-agent framework with retrieval and cross-model prompt evolution to enhance vulnerability detection and automate prompt engineering.
Findings
Achieves 34.79% Macro-F1 on 130 CWE types, outperforming baselines.
Cross-model prompt evolution boosts performance by 51.6%.
Effectively handles diverse vulnerability patterns with automated prompts.
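For readers unfamiliar with the headline metric: Macro-F1 is the unweighted mean of per-class F1 scores, so rare CWE types count as much as common ones. The sketch below uses the standard definition, not code from the paper; the function name `macro_f1` and the sample labels are illustrative assumptions.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative labels only; these are not data from the paper.
y_true = ["CWE-79", "CWE-79", "CWE-89", "CWE-120"]
y_pred = ["CWE-79", "CWE-89", "CWE-89", "CWE-79"]
print(round(macro_f1(y_true, y_pred), 4))
```

Because every class contributes equally, a class that is never predicted correctly pulls the average down by a full share. This is one reason Macro-F1 over many classes tends to be much lower than accuracy.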
Abstract
Large Language Models (LLMs) struggle to automate real-world vulnerability detection due to two key limitations: the heterogeneity of vulnerability patterns undermines the effectiveness of a single unified model, and manual prompt engineering for massive weakness categories is unscalable. To address these challenges, we propose MulVul, a retrieval-augmented multi-agent framework designed for precise and broad-coverage vulnerability detection. MulVul adopts a coarse-to-fine strategy: a Router agent first predicts the top-k coarse categories and then forwards the input to specialized Detector agents, which identify the exact vulnerability types. Both agents are equipped with retrieval tools to actively source evidence from vulnerability knowledge bases to mitigate hallucinations. Crucially, to automate the generation of specialized prompts, we design…
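The coarse-to-fine control flow described in the abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: in the paper the Router and Detectors are LLM agents with retrieval tools, whereas here they are hypothetical keyword-matching functions (`toy_router`, `toy_detectors`, `toy_retrieve`) so that the example runs on its own. The category names and the cutoff `k` are also assumptions.

```python
from typing import Callable, Dict, List

# Hypothetical type aliases; the real agents are LLM-backed.
Retriever = Callable[[str], List[str]]            # code -> evidence snippets
Router = Callable[[str, List[str]], List[str]]    # -> ranked coarse categories
Detector = Callable[[str, List[str]], List[str]]  # -> fine-grained CWE ids


def coarse_to_fine(code: str, router: Router, detectors: Dict[str, Detector],
                   retrieve: Retriever, k: int = 2) -> List[str]:
    """Route to the top-k coarse categories, then run only their detectors."""
    evidence = retrieve(code)
    top_k = router(code, evidence)[:k]
    findings: List[str] = []
    for category in top_k:
        detector = detectors.get(category)
        if detector is not None:
            findings.extend(detector(code, evidence))
    return findings


# Toy stand-ins (assumptions for illustration only).
def toy_retrieve(code: str) -> List[str]:
    return ["unbounded strcpy example"] if "strcpy" in code else []


def toy_router(code: str, evidence: List[str]) -> List[str]:
    ranked = []
    if "strcpy" in code or evidence:
        ranked.append("memory")
    if "SELECT" in code:
        ranked.append("injection")
    return ranked


toy_detectors: Dict[str, Detector] = {
    "memory": lambda code, ev: ["CWE-120"] if "strcpy" in code else [],
    "injection": lambda code, ev: ["CWE-89"] if "SELECT" in code else [],
}

snippet = 'strcpy(query, "SELECT * FROM users WHERE id = "); strcat(query, id);'
print(coarse_to_fine(snippet, toy_router, toy_detectors, toy_retrieve, k=2))
```

Limiting detection to the top-k routed categories is what keeps the number of specialized detector calls small even when the overall label space is large.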
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Software Engineering Research · Advanced Malware Detection Techniques
