TrajAD: Trajectory Anomaly Detection for Trustworthy LLM Agents
Yibing Liu, Chong Zhang, Zhongyi Han, Hansong Liu, Yong Wang, Yang Yu, Xiaoyan Wang, Yilong Yin

TL;DR
This paper introduces TrajAD, a method for detecting and localizing trajectory anomalies in LLM agents to improve trustworthiness, supported by a new benchmark dataset and specialized supervision techniques.
Contribution
The paper presents TrajAD, a novel approach for precise anomaly detection and localization in LLM agent trajectories, along with a new benchmark dataset, TrajBench.
Findings
General-purpose LLMs struggle with anomaly localization.
Specialized supervision improves anomaly detection accuracy.
TrajAD outperforms baseline methods on TrajBench.
Abstract
We address the problem of runtime trajectory anomaly detection, a critical capability for enabling trustworthy LLM agents. Current safety measures predominantly focus on static input/output filtering. However, we argue that ensuring the reliability of LLM agents requires auditing the intermediate execution process. In this work, we formulate the task of Trajectory Anomaly Detection. The goal is not merely detection, but precise error localization. This capability is essential for enabling efficient rollback-and-retry. To achieve this, we construct TrajBench, a dataset synthesized via a perturb-and-complete strategy to cover diverse procedural anomalies. Using this benchmark, we investigate the capability of models in process supervision. We observe that general-purpose LLMs, even with zero-shot prompting, struggle to identify and localize these anomalies. This reveals that generalized…
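The abstract's link between error localization and rollback-and-retry can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Step` and `Verdict` types, the `rollback` helper, and the `toy_detector` rule (flagging the first repeated action) are hypothetical and are not taken from TrajAD or TrajBench. The point is only that a verdict carrying a step index lets an agent keep the valid prefix of its trajectory rather than restarting from scratch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    """One agent step (hypothetical layout, not the paper's schema)."""
    thought: str
    action: str
    observation: str


@dataclass
class Verdict:
    """Detector output: a detection flag plus the localized error step."""
    anomalous: bool
    error_step: Optional[int] = None  # 0-based index of the first faulty step


def toy_detector(trajectory: List[Step]) -> Verdict:
    """Stand-in for a trained detector (assumption): flag the first repeated action."""
    seen = set()
    for i, step in enumerate(trajectory):
        if step.action in seen:
            return Verdict(anomalous=True, error_step=i)
        seen.add(step.action)
    return Verdict(anomalous=False)


def rollback(trajectory: List[Step], verdict: Verdict) -> List[Step]:
    """Keep only the steps that precede the localized error."""
    if not verdict.anomalous or verdict.error_step is None:
        return list(trajectory)
    return list(trajectory[:verdict.error_step])


trajectory = [
    Step("find the file", "ls", "a.txt"),
    Step("read it", "cat a.txt", "hello"),
    Step("read it again", "cat a.txt", "hello"),  # redundant repeat
    Step("finish", "done", ""),
]

verdict = toy_detector(trajectory)
kept = rollback(trajectory, verdict)
print(verdict.error_step, len(kept))
```

With detection alone, the agent would only know that the run went wrong somewhere and would have to retry all four steps. With the localized index, it can resume from the two steps that were kept.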
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning · Formal Methods in Verification
