RAG without Forgetting: Continual Query-Infused Key Memory
Yuntong Hu, Sha Li, Naren Ramakrishnan, Liang Zhao

TL;DR
This paper introduces ERM, a training-free method that transforms transient query-time improvements into persistent, stable retrieval index updates, enhancing RAG systems without additional inference costs.
Contribution
ERM is a novel, training-free framework that updates retrieval indices through correctness feedback, enabling continual learning and improved retrieval in RAG systems.
Findings
Consistent gains in retrieval and generation across 13 domains.
Significant improvements on reasoning-intensive tasks.
Zero inference-time overhead for index updates.
Abstract
Retrieval-augmented generation (RAG) systems commonly improve robustness via query-time adaptations such as query expansion and iterative retrieval. While effective, these approaches are inherently stateless: adaptations are recomputed for each query and discarded thereafter, precluding cumulative learning and repeatedly incurring inference-time cost. Index-side approaches like key expansion introduce persistence but rely on offline preprocessing or heuristic updates that are weakly aligned with downstream task utility, leading to semantic drift and noise accumulation. We propose Evolving Retrieval Memory (ERM), a training-free framework that transforms transient query-time gains into persistent retrieval improvements. ERM updates the retrieval index through correctness-gated feedback, selectively attributes atomic expansion signals to the document keys they benefit, and progressively…
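The correctness-gated update described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the set-of-terms keys, the overlap score, and the names `score`, `retrieve`, and `gated_update` are all assumptions made for illustration, and the sketch attributes every expansion term to the benefiting key rather than selecting among them. It only shows the general idea that expansion terms are written into a document key when the resulting answer was judged correct, so that a later, unexpanded query can benefit.

```python
def score(query_terms, key_terms):
    """Toy overlap score: how many distinct query terms appear in the key."""
    return sum(1 for t in set(query_terms) if t in key_terms)


def retrieve(query_terms, index):
    """Return the doc id with the highest overlap (ties go to the lowest id)."""
    return max(sorted(index), key=lambda d: score(query_terms, index[d]))


def gated_update(index, doc_id, expansion_terms, answer_correct):
    """Hypothetical gate: persist expansion terms only after a correct answer."""
    if not answer_correct:
        return index  # gate closed: the index is left unchanged
    updated = dict(index)  # shallow copy so the caller's index is not mutated
    updated[doc_id] = index[doc_id] | set(expansion_terms)
    return updated


# Illustrative data (made up for this sketch).
index = {
    "d1": {"python", "snake", "habitat"},
    "d2": {"python", "language", "syntax"},
}

# Before any update, this query ties between d1 and d2 and falls back to d1.
before = retrieve(["python", "iteration"], index)

# An expanded query led to a correct answer from d2, so its expansion
# terms are persisted into d2's key.
index_after = gated_update(index, "d2", ["iteration", "syntax"], answer_correct=True)

# The same unexpanded query now reaches d2 with no query-time expansion.
after = retrieve(["python", "iteration"], index_after)
```

In this sketch the cost of expansion is paid once, at update time; later lookups run against the enriched key with the same retrieval call as before.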
Taxonomy
Topics: Information Retrieval and Search Behavior · Topic Modeling · Personal Information Management and User Behavior
