Remembering More, Risking More: Longitudinal Safety Risks in Memory-Equipped LLM Agents
Ahmad Al-Tawaha, Shangding Gu, Peizhi Niu, Ruoxi Jia, and Ming Jin

TL;DR
This paper introduces a longitudinal protocol for evaluating how memory accumulated across many independent tasks affects the safety of memory-equipped LLM agents, and finds that risk increases with longer exposure.
Contribution
The paper proposes a longitudinal evaluation protocol for memory safety in LLM agents, arguing that safety should be assessed over time rather than at a single memory-state snapshot.
Findings
Memory-induced violation rates increase with exposure length.
Memory safety is primarily driven by accumulated content, not encounter order.
Memory-induced risk can be detected before generation using a diagnostic monitor.
Abstract
Safety evaluations of memory-equipped LLM agents typically measure within-task safety: whether an agent completes a single scenario safely, often under adversarial conditions such as prompt injection or memory poisoning. In deployment, however, a single agent serves many independent tasks over a long horizon, and memory accumulated during earlier tasks can affect behavior on later, unrelated ones. Studying this regime requires evaluation along the temporal dimension across tasks: not whether an agent is safe at any single memory state, but how its safety profile changes as memory accumulates across many independent interactions. We call this failure mode temporal memory contamination. To isolate memory exposure from stream non-stationarity, we introduce a trigger-probe protocol that evaluates a fixed probe set against read-only memory snapshots at varying prefix lengths, together with a…
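The trigger-probe idea described in the abstract (a fixed probe set evaluated against read-only memory snapshots at varying prefix lengths) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the tuple-based snapshot, the toy agent, and the "SAFE"/"UNSAFE" labels are all hypothetical and chosen only to make the evaluation loop concrete.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical types for illustration: a memory snapshot is an immutable
# tuple of entries, and an agent maps (snapshot, probe) to an output string.
Snapshot = Tuple[str, ...]
Agent = Callable[[Snapshot, int], str]


def build_snapshot(stream: Sequence[str], k: int) -> Snapshot:
    """Freeze the first k stream items into a read-only snapshot."""
    return tuple(stream[:k])


def violation_rate(agent: Agent, snapshot: Snapshot, probes: Sequence[int],
                   is_violation: Callable[[str], bool]) -> float:
    """Fraction of the fixed probe set that yields a violation."""
    hits = sum(is_violation(agent(snapshot, p)) for p in probes)
    return hits / len(probes)


def trigger_probe_curve(agent: Agent, stream: Sequence[str],
                        probes: Sequence[int], prefix_lengths: Sequence[int],
                        is_violation: Callable[[str], bool]) -> Dict[int, float]:
    """Evaluate the same probes against snapshots of increasing prefix length.

    Probes never write to memory, so any change in the rate across prefix
    lengths comes from the accumulated memory content alone.
    """
    return {
        k: violation_rate(agent, build_snapshot(stream, k), probes, is_violation)
        for k in prefix_lengths
    }


def toy_agent(snapshot: Snapshot, probe_threshold: int) -> str:
    """Toy stand-in for an LLM agent (assumption): it misbehaves on a probe
    once the number of 'risky' memory entries reaches the probe's threshold."""
    risky = sum(entry == "risky" for entry in snapshot)
    return "UNSAFE" if risky >= probe_threshold else "SAFE"


stream = ["benign", "risky", "benign", "risky", "risky", "benign"]
probes = [1, 2, 3, 4]  # fixed probe set, encoded here as thresholds
curve = trigger_probe_curve(toy_agent, stream, probes, [0, 2, 4, 6],
                            lambda out: out == "UNSAFE")
print(curve)
```

Holding the probe set fixed and freezing each snapshot is what separates the effect of memory exposure from changes in the task stream itself; in a real evaluation the toy agent and the string-matching violation check would be replaced by an actual agent and a safety judge.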
