Memory-Augmented Transformers: A Systematic Review from Neuroscience Principles to Enhanced Model Architectures
Parsa Omidi, Xingshuai Huang, Axel Laborieux, Bahareh Nikpour, Tianyu Shi, and Armaghan Eshaghi

TL;DR
This paper systematically reviews how neuroscience principles inform the development of Memory-Augmented Transformers, highlighting recent advances, core operations, challenges, and future directions for lifelong learning models.
Contribution
It offers a unified framework connecting neuroscience and engineering in Memory-Augmented Transformers, organizing recent progress and identifying key challenges and solutions.
Findings
A shift from static caches to adaptive, test-time learning systems
Persistent challenges in scalability and interference
Emerging solutions such as hierarchical buffering and surprise-gated updates
Abstract
Memory is fundamental to intelligence, enabling learning, reasoning, and adaptability across biological and artificial systems. While Transformer architectures excel at sequence modeling, they face critical limitations in long-range context retention, continual learning, and knowledge integration. This review presents a unified framework bridging neuroscience principles, including dynamic multi-timescale memory, selective attention, and consolidation, with engineering advances in Memory-Augmented Transformers. We organize recent progress through three taxonomic dimensions: functional objectives (context extension, reasoning, knowledge integration, adaptation), memory representations (parameter-encoded, state-based, explicit, hybrid), and integration mechanisms (attention fusion, gated control, associative retrieval). Our analysis of core memory operations (reading, writing, forgetting,…
Taxonomy
Topics: Advanced Memory and Neural Computing · EEG and Brain-Computer Interfaces · Ferroelectric and Negative Capacitance Devices
