When Routine Chats Turn Toxic: Unintended Long-Term State Poisoning in Personalized Agents

Xiaoyu Xu; Minxin Du; Qipeng Xie; Haobin Ke; Qingqing Ye; Haibo Hu

arXiv:2605.06731 · cs.CR · May 11, 2026

TL;DR

This paper shows that routine interactions with personalized LLM agents can unintentionally poison their long-term state, creating security vulnerabilities. It introduces a benchmark (ULSPB) and a defense (StateGuard) to address the issue.

Contribution

The paper formalizes the risk of unintended long-term state poisoning, introduces the ULSPB benchmark, and proposes StateGuard as an effective mitigation strategy.

Findings

01 Routine conversations can significantly poison long-term state.

02 StateGuard effectively reduces authorization drift and tool-use escalation.

03 Synthetic and real-world interactions confirm the poisoning risk.

Abstract

Personalized LLM agents maintain persistent cross-session state to support long-horizon collaboration. Yet this persistence introduces a subtle but critical security vulnerability: routine user-agent interactions can gradually reshape an agent's long-term state, inadvertently weakening future confirmation boundaries, expanding tool-use defaults, and escalating autonomous behavior over time. We formalize this risk as unintended long-term state poisoning. To systematically study it, we introduce the Unintended Long-Term State Poisoning Bench (ULSPB), a bilingual benchmark comprising 350 settings spanning five assistance categories, seven interaction patterns, 24-turn routine interactions, and matched single-injection counterparts. Furthermore, we define the Harm Score (HS), a state-centric metric that quantifies authorization drift, tool-use…
