From Features to Actions: Explainability in Traditional and Agentic AI Systems
Sindhuja Chaduvula, Jessee Ho, Kina Kim, Aravind Narayanan, Mahshid Alinoori, Muskan Garg, Dhanesh Ramachandram, Shaina Raza

TL;DR
This paper compares explainability methods for static and agentic AI systems and finds that trace-based diagnostics are more effective than attribution methods at locating failures over multi-step trajectories.
Contribution
It introduces a comparative analysis of attribution-based and trace-based explanations, highlighting the need for trajectory-level explainability in agentic AI systems.
Findings
Attribution methods are stable in static settings (Spearman rank correlation of 0.86).
Trace-based diagnostics effectively localize behavior failures.
State tracking inconsistency is 2.7 times more common in failed agentic runs.
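
The two kinds of numbers reported in these findings can be illustrated with a short sketch. It is not the authors' evaluation code: the function names (`average_ranks`, `spearman`, `prevalence_ratio`) and the toy inputs are assumptions made for illustration. The first part computes a Spearman rank correlation between two attribution vectors for the same input, one common way to quantify attribution stability. The second computes how much more often a diagnostic flag, such as a state-tracking inconsistency, appears in failed runs than in successful ones.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_a * var_b) ** 0.5


def prevalence_ratio(failed: Sequence[bool], passed: Sequence[bool]) -> float:
    """How many times more often a flag occurs in failed runs than in passed runs."""
    failed_rate = sum(failed) / len(failed)
    passed_rate = sum(passed) / len(passed)
    if passed_rate == 0:
        raise ValueError("flag never occurs in passed runs; ratio is undefined")
    return failed_rate / passed_rate


# Toy data (hypothetical): feature attributions from two explanation runs.
run_a = [0.40, 0.25, 0.20, 0.10, 0.05]
run_b = [0.38, 0.20, 0.27, 0.09, 0.06]
print(round(spearman(run_a, run_b), 2))

# Toy data (hypothetical): whether each run showed a state-tracking inconsistency.
failed_runs = [True, True, False, True]
passed_runs = [True, False, False, False]
print(prevalence_ratio(failed_runs, passed_runs))
```

A rank correlation is used here rather than a raw correlation because attribution stability is usually about whether the ordering of features is preserved, not their exact scores. The ratio is a plain descriptive statistic and says nothing about significance on its own.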
Abstract
Over the last decade, explainable AI has primarily focused on interpreting individual model predictions, producing post-hoc explanations that relate inputs to outputs under a fixed decision structure. Recent advances in large language models (LLMs) have enabled agentic AI systems whose behaviour unfolds over multi-step trajectories. In these settings, success and failure are determined by sequences of decisions rather than a single output. While such explanations are useful, it remains unclear how approaches designed for static predictions translate to agentic settings where behaviour emerges over time. In this work, we bridge the gap between static and agentic explainability by comparing attribution-based explanations with trace-based diagnostics across both settings. To make this distinction explicit, we empirically compare attribution-based explanations used in static classification tasks with…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Topic Modeling
