Syntax- and Compilation-Preserving Evasion of LLM Vulnerability Detectors
Luze Sun, Alina Oprea, Eric Wong

TL;DR
This paper evaluates how robust LLM-based vulnerability detectors are to syntax- and compilation-preserving code transformations, and finds that detectors with high clean recall are still readily evaded.
Contribution
It introduces Complete Resistance (CR), the fraction of correctly detected vulnerabilities that withstand all attack variants, and shows that current detectors are highly susceptible to behavior-preserving code edits.
Findings
Models with over 70% clean recall have CR as low as 0.12%.
Universal adversarial strings optimized on a 14B surrogate transfer effectively to black-box APIs, including GPT-4o.
On-target optimization amplifies evasion further, up to 92.5% ASR.
Abstract
LLM-based vulnerability detectors are increasingly deployed in CI/CD security gating, yet their resilience to evasion under syntax- and compilation-preserving edits remains poorly understood. We evaluate five attack variants spanning four carrier families of behavior-preserving code transformations on a unified C/C++ benchmark () and introduce Complete Resistance (CR), measuring the fraction of correctly detected vulnerabilities that withstand all attack variants. Our findings reveal a significant robustness gap: models achieving 70%+ clean recall exhibit CR as low as 0.12%, meaning over 87% of detected vulnerabilities can be evaded by at least one syntax-preserving edit. Universal adversarial strings optimized on a 14B surrogate transfer effectively to black-box APIs including GPT-4o, while on-target optimization further amplifies evasion (up to 92.5% ASR). These results…
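The CR definition in the abstract can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the authors' implementation: the function name `complete_resistance`, the boolean input layout, and the variant labels `variant_a` and `variant_b` are all hypothetical. It assumes each vulnerable sample has one clean prediction and one prediction per attack variant, and it counts a sample as resistant only if it is detected on the clean code and under every variant.

```python
from typing import Dict, List


def complete_resistance(
    clean_detected: List[bool],
    attacked_detected: Dict[str, List[bool]],
) -> float:
    """Fraction of cleanly detected vulnerabilities still detected under ALL variants.

    clean_detected[i]        -- True if sample i was flagged on the unmodified code.
    attacked_detected[v][i]  -- True if sample i was still flagged under variant v.
    (Hypothetical data layout, chosen for illustration only.)
    """
    # Only samples the detector caught on clean code enter the denominator.
    detected = [i for i, hit in enumerate(clean_detected) if hit]
    if not detected:
        raise ValueError("CR is undefined when nothing is detected on clean code")

    # A sample resists completely only if every attack variant fails on it.
    survivors = [
        i for i in detected
        if all(preds[i] for preds in attacked_detected.values())
    ]
    return len(survivors) / len(detected)


# Toy data: five vulnerable samples, two hypothetical attack variants.
clean = [True, True, True, False, True]
attacks = {
    "variant_a": [True, False, True, False, True],
    "variant_b": [True, True, False, False, True],
}

cr = complete_resistance(clean, attacks)
print(f"CR = {cr:.2f}, evaded by at least one variant = {1 - cr:.2f}")
```

Under this reading, `1 - CR` is the share of cleanly detected vulnerabilities that at least one variant evades, which is the quantity the abstract pairs with CR.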
