TL;DR
This paper introduces a multi-agent system that combines symbolic execution (KLEE) with large language models to detect memory vulnerabilities in incomplete Rust code snippets, a setting in which existing formal verification tools fail at compilation.
Contribution
The novel multi-agent architecture synthesizes KLEE-compatible harnesses from incomplete code, enabling effective vulnerability detection where prior tools fail.
Findings
Achieved 90.3% wrapper compilation success on 31 CVEs
Detected 1,206 critical errors in Rust code
Reduced wrapper failures from 42% to 9.7% compared to single-agent baseline
Abstract
This paper presents a system combining symbolic execution (KLEE) with a four-agent multi-LLM architecture for detecting memory vulnerabilities in unsafe Rust code. A central challenge we address is the incomplete-code problem: CVE database entries provide only isolated code snippets that lack struct definitions, imports, and Cargo manifests, causing all existing formal verification tools to fail at compilation with zero output. Our system resolves this through four specialized agents -- an Oracle/Validator for strategic planning, a Safety Checker for vulnerability analysis, a Code Specialist for FFI wrapper generation, and a Fast Filter for execution optimization -- that collaboratively synthesize KLEE-compatible harnesses from otherwise uncompilable fragments. KLEE's output is then ingested by graph_klee.py, which constructs a graph database linking CVE files, CWE categories, error types,…
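The graph-linking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' graph_klee.py: the record schema, the helper names (`build_graph`, `files_linked_to`), the file names, and the error-type labels are all hypothetical, and a plain in-memory adjacency map stands in for a real graph database.

```python
from collections import defaultdict

# Hypothetical records: the schema and every value here are illustrative only,
# not taken from the paper or from real KLEE output.
RECORDS = [
    {"cve_file": "snippet_a.rs", "cwe": "CWE-416", "error_type": "use_after_free"},
    {"cve_file": "snippet_b.rs", "cwe": "CWE-125", "error_type": "out_of_bounds"},
    {"cve_file": "snippet_c.rs", "cwe": "CWE-416", "error_type": "use_after_free"},
]


def build_graph(records):
    """Build an undirected adjacency map linking each file node to its
    CWE node and its error-type node. Nodes are (kind, name) tuples."""
    graph = defaultdict(set)
    for rec in records:
        file_node = ("file", rec["cve_file"])
        for kind in ("cwe", "error_type"):
            node = (kind, rec[kind])
            graph[file_node].add(node)
            graph[node].add(file_node)
    return graph


def files_linked_to(graph, kind, value):
    """Return the sorted file names adjacent to the node (kind, value)."""
    neighbours = graph.get((kind, value), set())
    return sorted(name for k, name in neighbours if k == "file")


graph = build_graph(RECORDS)
print(files_linked_to(graph, "cwe", "CWE-416"))
```

Keeping files, CWE categories, and error types as separate node kinds lets one structure answer both "which files exhibit this CWE?" and "which files produced this error type?".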
