A Dynamic Retrieval-Augmented Generation System with Selective Memory and Remembrance
Okan Bursa

TL;DR
This paper presents Adaptive RAG Memory (ARM), a dynamic retrieval-augmented generation framework that uses selective remembrance and decay, improving efficiency and performance in knowledge retention and retrieval tasks.
Contribution
The paper introduces a dynamic memory system for RAG, inspired by cognitive principles, that improves retrieval efficiency and interpretability without retraining.
Findings
ARM achieves near state-of-the-art performance with fewer parameters.
Dynamic RAG with ARM improves retrieval speed and coverage.
The system offers a practical trade-off between quality, latency, and memory efficiency.
Abstract
We introduce Adaptive RAG Memory (ARM), a retrieval-augmented generation (RAG) framework that replaces a static vector index with a dynamic memory substrate governed by selective remembrance and decay. Frequently retrieved items are consolidated and protected from forgetting, while rarely used items gradually decay, inspired by cognitive consolidation and forgetting principles. On a lightweight retrieval benchmark, ARM reaches near state-of-the-art performance (e.g., NDCG@5 0.940, Recall@5 ) with only 22M parameters in the embedding layer, achieving the best efficiency among ultra-efficient models (25M parameters). In addition, we compare static vs. dynamic RAG combinations across Llama 3.1 and GPT-4o. Llama 3.1 with static RAG achieves the highest key-term coverage (67.2%) at moderate latency, while GPT-4o with a dynamic selective retrieval…
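The abstract describes the mechanism only at a high level, so the following is a minimal sketch of one way selective remembrance and decay could work, not the paper's implementation. The class names, the exponential decay rule, and the parameters `decay`, `consolidate_after`, and `forget_below` are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryItem:
    text: str
    vector: list
    strength: float = 1.0       # decays over time unless the item is consolidated
    hits: int = 0               # number of times the item has been retrieved
    consolidated: bool = False  # consolidated items are protected from forgetting


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class DecayingMemory:
    """Hypothetical dynamic memory: retrieval reinforces items, time decays them."""

    def __init__(self, decay=0.9, consolidate_after=3, forget_below=0.2):
        self.decay = decay                          # assumed per-step decay factor
        self.consolidate_after = consolidate_after  # assumed hit count for consolidation
        self.forget_below = forget_below            # assumed strength threshold for removal
        self.items = {}

    def add(self, key, text, vector):
        self.items[key] = MemoryItem(text=text, vector=vector)

    def retrieve(self, query, k=1):
        """Return the keys of the k most similar items and reinforce them."""
        ranked = sorted(
            self.items,
            key=lambda key: cosine(query, self.items[key].vector),
            reverse=True,
        )
        top = ranked[:k]
        for key in top:
            item = self.items[key]
            item.hits += 1
            item.strength = 1.0  # a retrieval refreshes the item
            if item.hits >= self.consolidate_after:
                item.consolidated = True
        return top

    def step(self):
        """Advance time by one step: decay unconsolidated items and forget weak ones."""
        for key in list(self.items):
            item = self.items[key]
            if item.consolidated:
                continue
            item.strength *= self.decay
            if item.strength < self.forget_below:
                del self.items[key]
```

In this sketch, an item retrieved `consolidate_after` times is never decayed again, while an item that is never retrieved loses strength at every `step()` call and is eventually dropped. A real system would likely use a vector index rather than a linear scan and might decay on wall-clock time instead of discrete steps.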
Taxonomy
Topics: Information Retrieval and Search Behavior · Topic Modeling · Multimodal Machine Learning Applications
