XAgen: An Explainability Tool for Identifying and Correcting Failures in Multi-Agent Workflows
Xinru Wang, Ming Yin, Eunyee Koh, Mustafa Doga Dogan

TL;DR
XAgen is an explainability tool designed to help users identify, understand, and correct failures in multi-agent workflows powered by LLMs, supporting diverse user expertise levels.
Contribution
The paper introduces XAgen, an explainability tool for debugging multi-agent LLM systems that combines log visualization, human-in-the-loop feedback, and automatic error detection via an LLM-as-a-judge.
Findings
In a user study with 8 participants, XAgen helped users locate failures more easily and attribute them to specific agents or steps.
Users can iteratively refine workflows more effectively.
The study provides design guidelines for explainable agentic AI.
Abstract
As multi-agent systems powered by Large Language Models (LLMs) are increasingly adopted in real-world workflows, users with diverse technical backgrounds are now building and refining their own agentic processes. However, these systems can fail in opaque ways, making it difficult for users to observe, understand, and correct errors. We conducted formative interviews with 12 practitioners to identify mismatches between existing debugging tools and users' needs. Based on these insights, we designed XAgen, an explainability tool that supports users with varying AI expertise through three core capabilities: log visualization for glanceable workflow understanding, human-in-the-loop feedback to capture expert judgment, and automatic error detection via an LLM-as-a-judge. In a user study with 8 participants, XAgen helped users locate failures more easily, attribute to specific agents or steps,…
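The three capabilities named in the abstract can be illustrated with a small sketch. This is not XAgen's implementation: the `Step` record, the `keyword_judge` stand-in for an LLM-as-a-judge, and the rule that a human flag overrides the automatic judge are all assumptions made here for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    """One entry in a hypothetical multi-agent workflow log."""
    index: int
    agent: str
    output: str
    # Human feedback: True = marked faulty, False = marked fine, None = no feedback.
    human_flag: Optional[bool] = None


# A judge returns True when a step looks faulty.
Judge = Callable[[Step], bool]


def keyword_judge(step: Step) -> bool:
    """Toy stand-in for an LLM-as-a-judge call (assumption, not a real model)."""
    return "error" in step.output.lower()


def flag_steps(log: list[Step], judge: Judge) -> list[Step]:
    """Return faulty steps; explicit human feedback takes precedence over the judge."""
    flagged = []
    for step in log:
        faulty = step.human_flag if step.human_flag is not None else judge(step)
        if faulty:
            flagged.append(step)
    return flagged


def first_failure(log: list[Step], judge: Judge) -> Optional[Step]:
    """Attribute the failure to the earliest flagged step, if any."""
    flagged = flag_steps(log, judge)
    return flagged[0] if flagged else None


def render(log: list[Step], judge: Judge) -> str:
    """Plain-text 'glanceable' view of the log, marking flagged steps with '!'."""
    flagged = {s.index for s in flag_steps(log, judge)}
    lines = []
    for s in log:
        marker = "!" if s.index in flagged else " "
        lines.append(f"[{marker}] {s.index} {s.agent}: {s.output}")
    return "\n".join(lines)


# Hypothetical log: the judge alone would wrongly flag step 3 ("Error-free"),
# and would miss step 2; human feedback corrects both.
log = [
    Step(0, "planner", "Split task into search and summarize"),
    Step(1, "searcher", "Error: query returned no results"),
    Step(2, "writer", "Summary based on empty context", human_flag=True),
    Step(3, "reviewer", "Error-free draft approved", human_flag=False),
]

if __name__ == "__main__":
    print(render(log, keyword_judge))
    culprit = first_failure(log, keyword_judge)
    if culprit is not None:
        print(f"Earliest flagged step: {culprit.index} ({culprit.agent})")
```

In this sketch, taking the earliest flagged step as the point of failure is a deliberate simplification; a real tool would likely let users inspect every flagged step rather than only the first.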
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Scientific Computing and Data Management · Ethics and Social Impacts of AI
