$\delta$-mem: Efficient Online Memory for Large Language Models
Jingdi Lei, Di Zhang, Junxian Li, Weida Wang, Kaixuan Fan, Xiang Liu, Qihan Liu, Xiaoteng Ma, Baian Chen, Soujanya Poria

TL;DR
$\delta$-mem is a lightweight online memory mechanism that helps large language models use historical information efficiently without full fine-tuning.
Contribution
It introduces $\delta$-mem, a compact online associative memory that improves long-term information retention in language models with minimal overhead.
Findings
$\delta$-mem improves model scores by 1.10x on average over the backbone.
It achieves 1.31x improvement on MemoryAgentBench.
Effective memory can be integrated without full fine-tuning or context extension.
Abstract
Large language models increasingly need to accumulate and reuse historical information in long-term assistants and agent systems. Simply expanding the context window is costly and often fails to ensure effective context utilization. We propose $\delta$-mem, a lightweight memory mechanism that augments a frozen full-attention backbone with a compact online state of associative memory. $\delta$-mem compresses past information into a fixed-size state matrix updated by delta-rule learning, and uses its readout to generate low-rank corrections to the backbone's attention computation during generation. With only a compact online memory state, $\delta$-mem improves the average score over both the frozen backbone and the strongest non-$\delta$-mem memory baseline. It achieves larger gains on memory-heavy benchmarks, reaching on…
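To make the phrase "fixed-size state matrix updated by delta-rule learning" concrete, here is a minimal sketch of the classical delta rule for an associative memory, $S \leftarrow S + \beta\,(v - S k)\,k^\top$. It is an illustration under assumptions, not the paper's implementation: the function names `delta_update` and `read`, the step size `beta`, and the toy 2x2 state are all hypothetical, and the low-rank correction to the backbone's attention is omitted entirely.

```python
def delta_update(S, k, v, beta=1.0):
    """One delta-rule write: S <- S + beta * (v - S k) k^T.

    S is a list of rows (the fixed-size state), k is the key vector,
    v is the value vector. A new state is returned; S is not mutated.
    """
    # Current prediction for this key: S k
    pred = [sum(s_ij * k_j for s_ij, k_j in zip(row, k)) for row in S]
    # Prediction error: v - S k
    err = [v_i - p_i for v_i, p_i in zip(v, pred)]
    # Rank-one correction: add beta * err * k^T to the state
    return [[s_ij + beta * e_i * k_j for s_ij, k_j in zip(row, k)]
            for row, e_i in zip(S, err)]


def read(S, q):
    """Readout for a query vector q: S q."""
    return [sum(s_ij * q_j for s_ij, q_j in zip(row, q)) for row in S]


# Toy example: the state stays 2x2 no matter how many writes occur.
S = [[0.0, 0.0], [0.0, 0.0]]
k = [1.0, 0.0]                       # unit-norm key
S = delta_update(S, k, [2.0, 3.0])   # first association for k
S = delta_update(S, k, [5.0, 7.0])   # overwrite with a newer value
recalled = read(S, k)                # [5.0, 7.0]
```

With a unit-norm key and `beta=1.0`, the second write replaces the earlier value for the same key rather than summing with it, which is the property that lets a fixed-size state track changing information.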
