PrefixGuard: From LLM-Agent Traces to Online Failure-Warning Monitors

Xinmiao Huang; Jinwei Hu; Rajarshi Roy; Changshun Wu; Yi Dong; Xiaowei Huang

arXiv:2605.06455 · cs.AI · May 8, 2026

TL;DR

PrefixGuard is a framework for creating online failure-warning monitors for LLM agents, using trace analysis and supervised learning to improve early detection of failures in tool-using tasks.

Contribution

The paper introduces a trace-to-monitor framework that pairs an offline StepView induction step with supervised monitor training, improving early failure detection over raw-text controls.

Findings

01

The strongest PrefixGuard monitors reach 0.900/0.710/0.533/0.557 AUPRC on WebArena, $\tau^{2}$-Bench, SkillsBench, and TerminalBench, respectively.

02

Using the strongest backend within each representation, the monitors outperform raw-text controls by an average of +0.137 AUPRC.

03

Finite-state automata remain compact on some benchmarks but expand on others.
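
The findings above are reported in AUPRC (area under the precision-recall curve). As a rough illustration of what that number measures, the sketch below computes the common step-wise estimate, average precision, from per-trace failure labels and monitor risk scores. The function name `average_precision` and the convention that label 1 means "failed trace" are assumptions made for this example; the paper's exact evaluation protocol is not given on this page, and tied scores are not treated specially here.

```python
def average_precision(labels, scores):
    """Step-wise AUPRC estimate (average precision).

    labels: 1 for a failed trace (positive class), 0 otherwise (assumed convention).
    scores: risk scores from a monitor; higher means more likely to fail.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank traces from highest to lowest risk score.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)

    hits = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision at this rank, accumulated once per recalled positive.
            ap += hits / rank
    return ap / total_pos
```

A monitor that ranks every failed trace above every successful one scores 1.0 under this estimate, while a monitor with uninformative scores tends toward the fraction of failed traces in the data. That baseline dependence is why an AUPRC gap such as +0.137 is best read alongside each benchmark's failure rate.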

Abstract

Large language model (LLM) agents now execute long, tool-using tasks where final outcome checks can arrive too late for intervention. Online warning requires lightweight prefix monitors over heterogeneous traces, but hand-authored event schemas are brittle and deployment-time LLM judging is costly. We introduce PrefixGuard, a trace-to-monitor framework with an offline StepView induction step followed by supervised monitor training. StepView induces deterministic typed-step adapters from raw trace samples, and the monitor learns an event abstraction and prefix-risk scorer from terminal outcomes. Across WebArena, $\tau^{2}$-Bench, SkillsBench, and TerminalBench, the strongest PrefixGuard monitors reach 0.900/0.710/0.533/0.557 AUPRC. Using the strongest backend within each representation, they improve over raw-text controls by an average of +0.137 AUPRC. LLM judges remain substantially…
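
To make the pipeline in the abstract more concrete, here is a minimal sketch of an online prefix monitor: a deterministic adapter turns each raw trace record into a typed step, the step is mapped to an abstract event, and a running risk score is updated after every step. This is not the authors' implementation. All names (`Step`, `adapt_step`, `abstract_event`, `PrefixMonitor`), the raw-record fields (`tool`, `error`), and the additive logistic scorer are assumptions chosen for illustration. In PrefixGuard the adapters are induced by StepView and the abstraction and scorer are learned from terminal outcomes, whereas the weights here are supplied by the caller.

```python
import math
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Step:
    kind: str  # hypothetical step type, e.g. "tool_call" or "message"
    ok: bool   # whether the step completed without an error


def adapt_step(raw: dict) -> Step:
    """Hypothetical deterministic adapter: raw trace record -> typed step."""
    kind = "tool_call" if "tool" in raw else "message"
    return Step(kind=kind, ok=raw.get("error") is None)


def abstract_event(step: Step) -> str:
    """Collapse a typed step into a coarse event symbol."""
    return f"{step.kind}:{'ok' if step.ok else 'error'}"


@dataclass
class PrefixMonitor:
    weights: dict           # event symbol -> additive logit contribution (assumed)
    bias: float = 0.0
    threshold: float = 0.5  # warn once the risk reaches this value
    _logit: float = field(default=0.0, init=False)

    def __post_init__(self):
        self._logit = self.bias

    def update(self, raw: dict):
        """Consume one raw step; return (risk so far, whether to warn)."""
        event = abstract_event(adapt_step(raw))
        self._logit += self.weights.get(event, 0.0)
        risk = 1.0 / (1.0 + math.exp(-self._logit))
        return risk, risk >= self.threshold


# Illustrative weights only; a real monitor would fit these to labeled traces.
monitor = PrefixMonitor(
    weights={"tool_call:error": 1.5, "tool_call:ok": -0.2}, bias=-1.0
)
for record in [{"tool": "search"}, {"tool": "click", "error": "timeout"}]:
    risk, warn = monitor.update(record)
    print(f"risk={risk:.3f} warn={warn}")
```

Each update costs a dictionary lookup and one exponential, which is the kind of lightweight per-step work that makes prefix monitoring feasible online, in contrast to calling an LLM judge at every step.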

Code & Models

Repositories

shinmohuang/PrefixGuard (GitHub)