MASPrism: Lightweight Failure Attribution for Multi-Agent Systems Using Prefill-Stage Signals
Yang Liu, Hongjiang Feng, Junsong Pu, and Zhuangbin Chen

TL;DR
MASPrism is a lightweight failure attribution framework for multi-agent systems that uses prefill-stage signals from small language models to identify failure sources efficiently.
Contribution
It introduces a two-pass, prefill-stage signal analysis method that improves failure-source identification without costly agent workflows or training.
Findings
Achieves up to a 33.41% improvement in Top-1 accuracy on the Who&When-HC dataset.
Outperforms proprietary LLMs such as Gemini-2.5-Pro on the TRAIL dataset.
Processes traces in 2.66 seconds, 6.69 times faster than the baseline.
Abstract
Failure attribution in LLM-based multi-agent systems aims to identify the steps that contribute to a failed execution. This task remains difficult because a single execution can contain many agent actions and tool calls, failure evidence can appear many steps after the original mistake, and existing methods often rely on costly agent workflows, replay, or training on synthetic failure logs. To address these challenges, we propose MASPrism, a lightweight framework that performs failure attribution using prefill-stage signals from a small language model (SLM). MASPrism first extracts token-level negative log-likelihood and attention weights during a prefill pass to identify symptom-like steps and earlier candidate sources, without decoding. It then reconstructs a focused diagnostic prompt and performs a second prefill pass to rank failure-source candidates. Using Qwen3-0.6B as the SLM,…
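To make the token-level negative log-likelihood (NLL) signal concrete, here is a minimal sketch of how such a score could be computed from prefill logits and rolled up per step. It is an illustration under stated assumptions, not the authors' implementation. The function names (`token_nll`, `step_scores`, `rank_steps`), the mean aggregation per step, and the step-span representation are all hypothetical. The attention-weight signal and the second diagnostic pass are not covered.

```python
import math


def token_nll(logits, token_ids):
    """Per-token negative log-likelihood under teacher forcing.

    logits[t] is the score vector for predicting token_ids[t + 1], so the
    first token has no NLL and the result has len(token_ids) - 1 entries.
    """
    nll = []
    for t in range(len(token_ids) - 1):
        row = logits[t]
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        nll.append(log_z - row[token_ids[t + 1]])
    return nll


def step_scores(nll, step_spans):
    """Mean NLL per step (assumed aggregation, not taken from the paper).

    step_spans holds (start, end) index pairs into nll, end exclusive.
    """
    return [sum(nll[s:e]) / (e - s) for s, e in step_spans]


def rank_steps(scores):
    """Step indices sorted from most to least surprising."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In practice the logits would come from a single forward pass of the SLM over the full trace, with no decoding. Steps with high mean NLL are the ones the model finds most surprising, which is one plausible reading of "symptom-like" in the abstract.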
