Verify Before You Fix: Agentic Execution Grounding for Trustworthy Cross-Language Code Analysis

Jugal Gajjar

arXiv:2604.10800·cs.SE·April 14, 2026

TL;DR

This paper introduces an LLM-driven framework for cross-language vulnerability analysis in which each finding is confirmed by execution before any repair is attempted, targeting accurate and explainable results for trustworthy AI in software security.

Contribution

It presents a novel unified vulnerability lifecycle framework with execution-grounded validation and cross-language generalization via a shared structural schema and hybrid embeddings.

Findings

01. Achieves 89.84-92.02% intra-language detection accuracy.

02. Attains 74.43-80.12% zero-shot cross-language F1.

03. Resolves 69.74% of vulnerabilities end-to-end.

Abstract

Learned classifiers deployed in agentic pipelines face a fundamental reliability problem: predictions are probabilistic inferences, not verified conclusions, and acting on them without grounding in observable evidence leads to compounding failures across downstream stages. Software vulnerability analysis makes this cost concrete and measurable. We address this through a unified cross-language vulnerability lifecycle framework built around three LLM-driven reasoning stages (hybrid structural-semantic detection, execution-grounded agentic validation, and validation-aware iterative repair), governed by a strict invariant: no repair action is taken without execution-based confirmation of exploitability. Cross-language generalization is achieved via a Universal Abstract Syntax Tree (uAST) normalizing Java, Python, and C++ into a shared structural schema, combined with a hybrid fusion of…
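
The invariant stated in the abstract (no repair without execution-based confirmation) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names `Finding`, `ValidationResult`, and `run_lifecycle`, and the idea of passing the validation and repair stages in as plain callables, are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Finding:
    """A detector's prediction about one code snippet (hypothetical structure)."""
    snippet_id: str
    predicted_vulnerable: bool
    confidence: float


@dataclass
class ValidationResult:
    """Outcome of trying to trigger the suspected flaw by execution (hypothetical)."""
    exploit_confirmed: bool
    evidence: str


def run_lifecycle(
    finding: Finding,
    validate: Callable[[Finding], ValidationResult],
    repair: Callable[[Finding, ValidationResult], str],
) -> Optional[str]:
    """Return a patch only when execution confirmed exploitability."""
    if not finding.predicted_vulnerable:
        # Nothing was flagged, so there is nothing to validate or repair.
        return None

    result = validate(finding)
    if not result.exploit_confirmed:
        # Invariant: an unconfirmed prediction never reaches the repair stage.
        return None

    return repair(finding, result)
```

The point of the gate is that a detector's confidence score alone never triggers a repair; only the validation outcome does, so a false positive from the first stage stops there instead of propagating downstream.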
