CIVeX: Causal Intervention Verification for Language Agents
Fabio Rovai

TL;DR
CIVeX is a causal intervention verifier for language agents that ensures actions have identifiable causal effects, improving safety and utility in confounded workflows.
Contribution
The paper introduces a causal intervention verification method that maps proposed actions to causal queries and returns auditable verdicts, making tool use more reliable.
Findings
Zero false executions on Causal-ToolBench
84.9% accuracy under adversarial confounding
Cuts false executions by at least 50x relative to naive baselines
Abstract
A valid tool call is not necessarily a valid intervention. Tool-using language agents are guarded by schema validators, policy filters, provenance checks, state predictors, and self-verification, yet such safeguards do not certify that a state-changing action has an identifiable causal effect. In confounded workflows, the action that looks optimal in observational logs can reduce utility when executed. We introduce CIVeX, a causal intervention verifier that maps proposed actions to structural causal queries over a committed action-state graph, checks identifiability, and returns one of four auditable verdicts: EXECUTE, REJECT, EXPERIMENT, or ABSTAIN. Execution requires an assumption-scoped causal certificate carrying graph commitments, an identification argument, a one-sided lower confidence bound (LCB), provenance, and risk limits. On Causal-ToolBench (1,890 instances, 7 seeds),…
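The verdict logic described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the decision rule, the normal-approximation bounds, and every name here (`verify`, `one_sided_bounds`, `can_experiment`, `min_gain`, `alpha`) are assumptions made for illustration. The abstract only states that execution requires an identifiable effect and a one-sided lower confidence bound (LCB); how REJECT, EXPERIMENT, and ABSTAIN are assigned below is a guess at one reasonable mapping.

```python
from enum import Enum
from math import sqrt
from statistics import NormalDist, mean, stdev


class Verdict(Enum):
    EXECUTE = "EXECUTE"
    REJECT = "REJECT"
    EXPERIMENT = "EXPERIMENT"
    ABSTAIN = "ABSTAIN"


def one_sided_bounds(samples, alpha=0.05):
    """Return (lower, upper) one-sided bounds on the mean effect.

    Uses a normal approximation; each bound holds separately at level 1 - alpha.
    """
    se = stdev(samples) / sqrt(len(samples))
    z = NormalDist().inv_cdf(1 - alpha)
    centre = mean(samples)
    return centre - z * se, centre + z * se


def verify(identifiable, effect_samples, can_experiment, min_gain=0.0, alpha=0.05):
    """Hypothetical verdict rule (an assumption, not taken from the paper)."""
    if not identifiable:
        # The effect cannot be estimated from logs: gather interventional
        # data if that is permitted, otherwise decline to act.
        return Verdict.EXPERIMENT if can_experiment else Verdict.ABSTAIN
    lcb, ucb = one_sided_bounds(effect_samples, alpha)
    if lcb > min_gain:
        return Verdict.EXECUTE   # effect is confidently above the threshold
    if ucb < min_gain:
        return Verdict.REJECT    # effect is confidently below the threshold
    return Verdict.ABSTAIN       # evidence is inconclusive


helpful = [1.0, 1.2, 0.9, 1.1, 1.0]
harmful = [-x for x in helpful]
print(verify(True, helpful, can_experiment=False))
print(verify(True, harmful, can_experiment=False))
print(verify(False, [], can_experiment=True))
```

Gating EXECUTE on the lower bound rather than the point estimate means a noisy but positive-looking estimate does not trigger a state-changing action, which matches the abstract's emphasis on avoiding false executions.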
