Memory Bank Compression for Continual Adaptation of Large Language Models
Thomas Katraouras, Dimitrios Rafailidis

TL;DR
This paper introduces MBC, a memory bank compression method for continual learning in large language models, significantly reducing memory size while maintaining high accuracy during online updates.
Contribution
MBC employs a codebook optimization and online resetting to compress memory banks in LLMs, enabling efficient continual learning with minimal memory growth.
Findings
Memory bank size reduced to 0.3% of baseline
Maintains high accuracy during online adaptation
Effective in question-answering benchmarks
Abstract
Large Language Models (LLMs) have become a mainstay for many everyday applications. However, as data evolve their knowledge quickly becomes outdated. Continual learning aims to update LLMs with new information without erasing previously acquired knowledge. Although methods such as full fine-tuning can incorporate new data, they are computationally expensive and prone to catastrophic forgetting, where prior knowledge is overwritten. Memory-augmented approaches address this by equipping LLMs with a memory bank, that is an external memory module which stores information for future use. However, these methods face a critical limitation, in particular, the memory bank constantly grows in the real-world scenario when large-scale data streams arrive. In this paper, we propose MBC, a model that compresses the memory bank through a codebook optimization strategy during online adaptation…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsDomain Adaptation and Few-Shot Learning · Topic Modeling · Multimodal Machine Learning Applications
