Project Ariadne: A Structural Causal Framework for Auditing Faithfulness in LLM Agents
Sourena Khanzadeh

TL;DR
Project Ariadne introduces a causal auditing framework for LLM agents that assesses the faithfulness of their reasoning traces using structural causal models and counterfactual interventions, revealing widespread decoupling between reasoning and decisions.
Contribution
It presents a novel causal framework for auditing LLM reasoning, highlighting the prevalence of unfaithful explanations and proposing the Ariadne Score as a new benchmark.
Findings
Detected a faithfulness gap in state-of-the-art models
Identified a causal decoupling failure mode with high violation density
Showed reasoning traces often do not influence final decisions
Abstract
As Large Language Model (LLM) agents are increasingly tasked with high-stakes autonomous decision-making, the transparency of their reasoning processes has become a critical safety concern. While Chain-of-Thought (CoT) prompting allows agents to generate human-readable reasoning traces, it remains unclear whether these traces are faithful generative drivers of the model's output or merely post-hoc rationalizations. We introduce Project Ariadne, a novel XAI framework that utilizes Structural Causal Models (SCMs) and counterfactual logic to audit the causal integrity of agentic reasoning. Unlike existing interpretability methods that rely on surface-level textual similarity, Project Ariadne performs hard interventions (do-calculus) on intermediate reasoning nodes, systematically inverting logic, negating premises, and reversing factual…
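The intervention idea in the abstract can be illustrated with a toy sketch. This is not the paper's implementation and it does not compute the Ariadne Score, whose definition is not given in the text above. It assumes each reasoning step can be reduced to a boolean premise, and every name in it (faithful_agent, decoupled_agent, intervene, flip_rate) is hypothetical. It only shows the underlying logic: negate one step at a time and check whether the final decision moves.

```python
from typing import Callable, List

# Toy assumption: each reasoning step is reduced to a boolean premise.
Trace = List[bool]
Agent = Callable[[Trace], str]


def faithful_agent(trace: Trace) -> str:
    """Hypothetical agent whose decision is computed from its trace."""
    return "approve" if all(trace) else "reject"


def decoupled_agent(trace: Trace) -> str:
    """Hypothetical agent whose decision ignores its trace entirely."""
    return "approve"


def intervene(trace: Trace, index: int) -> Trace:
    """Hard intervention: force one step to its negation, keep the rest."""
    edited = list(trace)
    edited[index] = not edited[index]
    return edited


def flip_rate(agent: Agent, trace: Trace) -> float:
    """Fraction of single-step interventions that change the decision."""
    if not trace:
        raise ValueError("trace must contain at least one step")
    baseline = agent(trace)
    flips = sum(agent(intervene(trace, i)) != baseline for i in range(len(trace)))
    return flips / len(trace)


if __name__ == "__main__":
    trace = [True, True, True]
    print(flip_rate(faithful_agent, trace))   # 1.0
    print(flip_rate(decoupled_agent, trace))  # 0.0
```

In this toy setting, a flip rate near zero means the decision does not respond to edits of the trace, which is the pattern the summary above calls causal decoupling. A real audit would intervene on natural-language reasoning steps and re-query the model, which this sketch does not attempt.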
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Ethics and Social Impacts of AI
