Verify Before You Fix: Agentic Execution Grounding for Trustworthy Cross-Language Code Analysis
Jugal Gajjar

TL;DR
This paper introduces an LLM-driven framework for cross-language vulnerability analysis in which every repair is grounded in execution-based confirmation of exploitability, aiming for accurate and explainable results that support trustworthy AI in software security.
Contribution
It presents a novel unified vulnerability lifecycle framework with execution-grounded validation and cross-language generalization via a shared structural schema and hybrid embeddings.
Findings
Achieves 89.84-92.02% intra-language detection accuracy.
Attains 74.43-80.12% zero-shot cross-language F1.
Resolves 69.74% of vulnerabilities end-to-end.
Abstract
Learned classifiers deployed in agentic pipelines face a fundamental reliability problem: predictions are probabilistic inferences, not verified conclusions, and acting on them without grounding in observable evidence leads to compounding failures across downstream stages. Software vulnerability analysis makes this cost concrete and measurable. We address this through a unified cross-language vulnerability lifecycle framework built around three LLM-driven reasoning stages (hybrid structural-semantic detection, execution-grounded agentic validation, and validation-aware iterative repair) governed by a strict invariant: no repair action is taken without execution-based confirmation of exploitability. Cross-language generalization is achieved via a Universal Abstract Syntax Tree (uAST) normalizing Java, Python, and C++ into a shared structural schema, combined with a hybrid fusion of…
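The invariant stated in the abstract (no repair without execution-based confirmation) can be illustrated with a minimal control-flow sketch. This is not the paper's implementation: `run_lifecycle`, `Finding`, the threshold, the round limit, and the toy stand-in functions are all hypothetical names and values chosen for illustration, and the string checks below only stand in for a real detector, a sandboxed exploit run, and an LLM-driven repair step.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Finding:
    snippet: str             # code flagged by the detector
    score: float             # detector confidence: an inference, not a fact
    confirmed: bool = False  # set only by execution-based validation


def run_lifecycle(
    snippet: str,
    detect: Callable[[str], float],
    validate: Callable[[str], bool],
    repair: Callable[[str], str],
    threshold: float = 0.5,  # hypothetical cut-off, not from the paper
    max_rounds: int = 3,     # hypothetical repair budget
) -> Optional[str]:
    """Return a repaired snippet, or None if no repair was warranted or found."""
    finding = Finding(snippet, detect(snippet))
    if finding.score < threshold:
        return None  # detector did not flag the snippet

    # Gate: the repair stage is reachable only after the exploit is confirmed.
    finding.confirmed = validate(snippet)
    if not finding.confirmed:
        return None  # unconfirmed prediction, so no repair action is taken

    candidate = snippet
    for _ in range(max_rounds):
        candidate = repair(candidate)
        if not validate(candidate):  # exploit no longer reproduces
            return candidate
    return None  # repair budget exhausted without a validated fix


# Toy stand-ins (assumptions for illustration only).
def toy_detect(code: str) -> float:
    return 0.9 if "os.system(" in code else 0.1


def toy_validate(code: str) -> bool:
    # Stands in for actually executing an exploit against the code.
    return "os.system(" in code


def toy_repair(code: str) -> str:
    return code.replace("os.system(", "subprocess.run(")


if __name__ == "__main__":
    print(run_lifecycle("os.system(cmd)", toy_detect, toy_validate, toy_repair))
```

The design point is that `repair` is never called on the strength of `detect` alone: a high detector score with a failed validation returns `None` without touching the code.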
