LoopTrap: Termination Poisoning Attacks on LLM Agents
Huiyu Xu, Zhibo Wang, Wenhui Zhang, Ziqi Zhu, Yaopeng Wang, Kui Ren, Chun Chen

TL;DR
This paper reveals a new security risk called Termination Poisoning in LLM agents' iterative loops, demonstrating how malicious prompts can cause unbounded computation and proposing LoopTrap, an automated framework to craft such attacks.
Contribution
It systematically characterizes Termination Poisoning attacks, develops 10 attack strategies, and introduces LoopTrap, an adaptive red-teaming framework that automates malicious prompt synthesis.
Findings
LoopTrap achieves 3.57× step amplification on average across 8 agents.
Different agents exhibit distinct vulnerabilities to attack strategies.
LoopTrap's adaptive approach outperforms manual prompt crafting.
Abstract
Modern LLM agents solve complex tasks by operating in iterative execution loops, where they repeatedly reason, act, and self-evaluate progress to determine when a task is complete. In this work, we show that while this self-directed loop facilitates autonomy, it also introduces a critical risk: by injecting malicious prompts into the agent's context, an adversary can distort the agent's termination judgment, making it believe the task remains incomplete and leading to unbounded computation.To understand this threat, we define and systematically characterize it as Termination Poisoning and design 10 representative attack strategies. Through a empirical study spanning 8 LLM agents and 60 tasks, we demonstrate that different LLM agents exhibit distinct behavioral signatures that determine which strategies succeed. These transferable patterns can serve as principled guidance for crafting…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
