MulVul: Retrieval-augmented Multi-Agent Code Vulnerability Detection via Cross-Model Prompt Evolution

Zihan Wu; Jie Xu; Yun Peng; Chun Yong Chong; Xiaohua Jia

arXiv:2601.18847·cs.SE·January 28, 2026

Open Access

TL;DR

MulVul is a retrieval-augmented multi-agent framework that improves vulnerability detection across diverse patterns by combining coarse-to-fine classification with cross-model prompt evolution for automated prompt optimization.

Contribution

This paper introduces MulVul, a novel multi-agent framework with retrieval and cross-model prompt evolution to enhance vulnerability detection and automate prompt engineering.

Findings

01. Achieves 34.79% Macro-F1 on 130 CWE types, outperforming baselines.

02. Cross-model prompt evolution boosts performance by 51.6%.

03. Effectively handles diverse vulnerability patterns with automated prompts.
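For readers unfamiliar with the metric in the first finding: Macro-F1 is the unweighted mean of per-class F1 scores, so rare CWE types count as much as common ones. The sketch below is a generic illustration, not the paper's evaluation script; the function name `macro_f1`, the toy labels, and the choice to average over every label seen in either list are assumptions.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list.

    Generic definition for illustration; the paper's exact evaluation
    protocol is not given in this summary.
    """
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is at least 1
        # because every label appears in y_true or y_pred.
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)


# Toy data (made up): a model that always predicts the majority class.
y_true = ["CWE-89", "CWE-89", "CWE-89", "CWE-787"]
y_pred = ["CWE-89", "CWE-89", "CWE-89", "CWE-89"]
print(round(macro_f1(y_true, y_pred), 4))
```

In this toy case accuracy is 75%, yet Macro-F1 is only about 43% because the missed minority class contributes an F1 of zero. This is why Macro-F1 figures over many classes tend to look low in absolute terms.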

Abstract

Large Language Models (LLMs) struggle to automate real-world vulnerability detection due to two key limitations: the heterogeneity of vulnerability patterns undermines the effectiveness of a single unified model, and manual prompt engineering for massive weakness categories is unscalable. To address these challenges, we propose MulVul, a retrieval-augmented multi-agent framework designed for precise and broad-coverage vulnerability detection. MulVul adopts a coarse-to-fine strategy: a Router agent first predicts the top-k coarse categories and then forwards the input to specialized Detector agents, which identify the exact vulnerability types. Both agents are equipped with retrieval tools to actively source evidence from vulnerability knowledge bases to mitigate hallucinations. Crucially, to automate the generation of specialized prompts, we design…
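The coarse-to-fine control flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: in MulVul the Router and Detector are LLM agents, whereas here both are replaced by a simple token-overlap score so the example runs without any model. The knowledge-base entries and all names (`KB`, `retrieve`, `router`, `detector`, `detect`) are hypothetical.

```python
import re
from collections import Counter
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    category: str  # coarse category
    cwe: str       # fine-grained vulnerability type
    text: str      # evidence text


# Hypothetical toy knowledge base; a real one would hold curated examples.
KB = [
    Entry("memory", "CWE-787", "strcpy memcpy buffer write out of bounds"),
    Entry("memory", "CWE-416", "free pointer use after free dangling"),
    Entry("injection", "CWE-89", "sql query string concatenation execute"),
    Entry("injection", "CWE-78", "system exec shell command input"),
]


def tokens(text: str) -> Counter:
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def retrieve(code: str, category: Optional[str] = None, n: int = 2):
    """Retrieval tool: top-n entries with nonzero token overlap."""
    query = tokens(code)
    pool = [e for e in KB if category is None or e.category == category]
    scored = [(sum((query & tokens(e.text)).values()), e) for e in pool]
    scored = [(s, e) for s, e in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]


def router(code: str, k: int = 2):
    """Stand-in for the Router agent: top-k coarse categories by evidence."""
    votes = Counter()
    for score, entry in retrieve(code, n=len(KB)):
        votes[entry.category] += score
    return [category for category, _ in votes.most_common(k)]


def detector(code: str, category: str):
    """Stand-in for a Detector agent: best fine-grained type in one category."""
    hits = retrieve(code, category, n=1)
    return (hits[0][1].cwe, hits[0][0]) if hits else None


def detect(code: str, k: int = 2) -> Optional[str]:
    """Route to k categories, run each detector, keep the strongest finding."""
    findings = [f for c in router(code, k) if (f := detector(code, c))]
    return max(findings, key=lambda f: f[1])[0] if findings else None


snippet = 'query = "SELECT * FROM users WHERE id=" + uid; db.execute(query)'
print(router(snippet), detect(snippet))
```

The point of the two-stage design is that each detector only has to discriminate among the types in its own category, and the router's top-k output leaves room for a wrong first guess to be corrected downstream. How MulVul aggregates detector outputs is not stated in this summary; taking the highest-scoring finding is an assumption of the sketch.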

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Software Engineering Research · Advanced Malware Detection Techniques