Interpreting Agentic Systems: Beyond Model Explanations to System-Level Accountability

Judy Zhu; Dhari Gandhi; Himanshu Joshi; Ahmad Rezaie Mianroodi; Sedef Akinli Kocak; Dhanesh Ramachandran

arXiv:2601.17168·cs.AI·January 27, 2026

TL;DR

This paper examines the limitations of existing interpretability methods for agentic AI systems, emphasizing the need for new techniques to ensure safety, accountability, and oversight across their complex, goal-directed behaviors.

Contribution

It identifies gaps in current interpretability approaches for agentic systems and proposes future research directions for developing system-level oversight tools.

Findings

01. Current interpretability methods are insufficient for agentic systems.
02. Agentic systems require new, tailored interpretability techniques.
03. Embedding oversight mechanisms is crucial for safe deployment.

Abstract

Agentic systems have transformed how Large Language Models (LLMs) can be leveraged to create autonomous systems with goal-directed behaviors, including multi-step planning and the ability to interact with different environments. These systems differ fundamentally from traditional machine learning models in both architecture and deployment, and they introduce unique AI safety challenges, including goal misalignment, compounding decision errors, and coordination risks among interacting agents. These challenges necessitate embedding interpretability and explainability by design to ensure traceability and accountability across autonomous behaviors. Current interpretability techniques, developed primarily for static models, show limitations when applied to agentic systems. The temporal dynamics, compounding decisions, and context-dependent behaviors of agentic systems demand new analytical…

Code & Models

Datasets

Kylan12/Synthetic-AI-ML-Dataset
dataset · 42 dl

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Multi-Agent Systems and Negotiation · AI-based Problem Solving and Planning