Useful Memories Become Faulty When Continuously Updated by LLMs

Dylan Zhang; Yanshan Lin; Zhengkun Wu; Yihang Sun; Bingxuan Li; Dianqi Li; Hao Peng

arXiv:2605.12978·cs.AI·May 14, 2026


TL;DR

This paper shows that memories continuously consolidated by LLMs can become faulty and degrade agent performance, and it argues that reliable agent memory requires preserving raw episodic traces and controlling consolidation explicitly.

Contribution

The paper shows that consolidated memories produced by today's LLMs are often faulty, even when derived from useful experiences, and proposes preserving raw episodes and managing consolidation explicitly.

Findings

01

As consolidation proceeds, memory utility first rises, then degrades, and can fall below the no-memory baseline.

02

Even when consolidating from ground-truth solutions, GPT-5.4 fails on 54% of a set of ARC-AGI problems it had previously solved without memory.

03

Preserving raw episodes and controlling consolidation improve accuracy.
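
The 54% figure in finding 02 is a regression rate: it is measured only over problems the model solved without memory. The Python sketch below shows one way such a rate could be computed from per-problem outcomes. The function name `regression_rate` and the toy outcome dictionaries are hypothetical and are not taken from the paper.

```python
def regression_rate(solved_without_memory, solved_with_memory):
    """Fraction of problems solved without memory that fail once memory is used.

    Both arguments map a problem id to True (solved) or False (failed).
    """
    previously_solved = [pid for pid, ok in solved_without_memory.items() if ok]
    if not previously_solved:
        raise ValueError("no previously solved problems to evaluate")
    # A problem missing from the with-memory run is counted as a failure.
    regressed = [
        pid for pid in previously_solved
        if not solved_with_memory.get(pid, False)
    ]
    return len(regressed) / len(previously_solved)


# Hypothetical outcomes for five problems (illustrative data only).
baseline = {"p1": True, "p2": True, "p3": False, "p4": True, "p5": True}
with_memory = {"p1": True, "p2": False, "p3": True, "p4": False, "p5": True}

rate = regression_rate(baseline, with_memory)
print(rate)  # 0.5: p2 and p4 regress out of the four previously solved problems
```

Note that p3, which is newly solved with memory, does not offset the regressions, because the denominator only counts problems solved in the no-memory run.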

Abstract

Learning from past experience benefits from two complementary forms of memory: episodic traces -- raw trajectories of what happened -- and consolidated abstractions distilled across many episodes into reusable, schema-like lessons. Recent agentic-memory systems pursue the consolidated form: an LLM rewrites past trajectories into a textual memory bank that it continuously updates with new interactions, promising self-improving agents without parameter updates. Yet we find that such consolidated memories produced by today's LLMs are often faulty even when derived from useful experiences. As consolidation proceeds, memory utility first rises, then degrades, and can fall below the no-memory baseline. More surprisingly, even when consolidating from ground-truth solutions, GPT-5.4 fails on 54% of a set of ARC-AGI problems it had previously solved without memory. We trace the regression to the…
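
To make the two forms of memory concrete, the Python sketch below keeps raw episodes in an append-only list next to a consolidated text that is replaced only when a candidate rewrite scores at least as well as the current one. Everything here is an assumption made for illustration: the `MemoryBank` class, the `rewrite` and `score` callables, and the accept-if-not-worse gate are hypothetical and are not the mechanism described in the paper. In practice `rewrite` would be an LLM call and `score` an evaluation on held-out problems.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MemoryBank:
    # Raw trajectories: appended to, never rewritten (hypothetical design).
    episodes: List[str] = field(default_factory=list)
    # Consolidated, schema-like lessons distilled from the episodes.
    lessons: str = ""

    def record(self, trajectory: str) -> None:
        """Store a raw episode verbatim."""
        self.episodes.append(trajectory)

    def consolidate(
        self,
        rewrite: Callable[[str, List[str]], str],
        score: Callable[[str], float],
    ) -> bool:
        """Propose new lessons; keep them only if they do not score worse."""
        candidate = rewrite(self.lessons, self.episodes)
        if score(candidate) >= score(self.lessons):
            self.lessons = candidate
            return True
        return False  # candidate rejected; raw episodes are untouched either way


# Toy stand-ins: text length is a placeholder score, not a real utility measure.
def toy_score(memory: str) -> float:
    return float(len(memory))


bank = MemoryBank()
bank.record("rotate grid 90 degrees")
accepted_first = bank.consolidate(lambda old, eps: "rotation lesson", toy_score)

bank.record("mirror grid horizontally")
# A degenerate rewrite that erases the lessons is rejected by the gate.
accepted_second = bank.consolidate(lambda old, eps: "", toy_score)

print(accepted_first, accepted_second, bank.lessons, len(bank.episodes))
```

Because the episodes are never overwritten, a rejected or faulty consolidation can always be redone from the original traces.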
