COLA: Continual Learning via Autoencoder Retrieval of Adapters
Jaya Krishna Mandivarapu

TL;DR
COLA introduces a novel autoencoder-based framework for continual learning in language models, enabling efficient knowledge transfer and mitigating catastrophic forgetting without extensive data replay or large task-specific parameters.
Contribution
The paper proposes COLA, a new method that uses autoencoders to capture task-specific weights, allowing continual learning with minimal data and parameter overhead, outperforming existing methods.
Findings
Reduces catastrophic forgetting in LLMs.
Achieves significant parameter and memory efficiency.
Outperforms state-of-the-art continual learning methods.
Abstract
Learning a set of tasks over time, also known as continual learning (CL), is one of the most challenging problems in artificial intelligence due to catastrophic forgetting. Large language models (LLMs) are often impractical to retrain frequently or to train continually because of the high computational cost of training. Moreover, LLMs are not well suited to continual learning: updating these models over time to acquire new knowledge overwrites existing knowledge, a common phenomenon known as catastrophic forgetting. In this paper, we aim to address these concerns using a novel framework, COLA, that employs an autoencoder to capture low-dimensional embeddings of the weights associated with various tasks. Our approach facilitates the transfer of knowledge to new tasks while preventing catastrophic forgetting, all without using data replay or a…
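The abstract's central idea, compressing per-task weights into low-dimensional codes with an autoencoder and decoding them again when a task is needed, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the linear encoder and decoder, the dimensions, the random placeholder vectors standing in for flattened adapter weights, and the names `LinearAutoencoder`, `task_weights`, and `codes` are all hypothetical.

```python
import random


def matvec(matrix, vec):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


class LinearAutoencoder:
    """Hypothetical stand-in: a linear encoder and decoder without biases."""

    def __init__(self, dim, code_dim, rng):
        self.enc = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                    for _ in range(code_dim)]
        self.dec = [[rng.uniform(-0.1, 0.1) for _ in range(code_dim)]
                    for _ in range(dim)]

    def encode(self, x):
        return matvec(self.enc, x)

    def decode(self, z):
        return matvec(self.dec, z)

    def loss(self, data):
        """Mean squared reconstruction error over all weight vectors."""
        total = 0.0
        for x in data:
            recon = self.decode(self.encode(x))
            total += sum((r - t) ** 2 for r, t in zip(recon, x)) / len(x)
        return total / len(data)

    def train_step(self, x, lr):
        """One SGD step on 0.5 * ||decode(encode(x)) - x||^2."""
        z = self.encode(x)
        err = [r - t for r, t in zip(self.decode(z), x)]
        # Backpropagate through the decoder before updating it.
        grad_z = [sum(self.dec[i][j] * err[i] for i in range(len(x)))
                  for j in range(len(z))]
        for i in range(len(x)):
            for j in range(len(z)):
                self.dec[i][j] -= lr * err[i] * z[j]
        for j in range(len(z)):
            for i in range(len(x)):
                self.enc[j][i] -= lr * grad_z[j] * x[i]


rng = random.Random(0)
DIM, CODE_DIM = 8, 2  # assumed sizes; real adapter weights are far larger

# Placeholder "adapter weights": one flattened random vector per task.
task_weights = {f"task_{i}": [rng.uniform(-1, 1) for _ in range(DIM)]
                for i in range(4)}

ae = LinearAutoencoder(DIM, CODE_DIM, rng)
data = list(task_weights.values())
initial_loss = ae.loss(data)
for _ in range(3000):
    for x in data:
        ae.train_step(x, lr=0.02)
final_loss = ae.loss(data)

# Store only the small code per task, then decode it to retrieve the weights.
codes = {name: ae.encode(w) for name, w in task_weights.items()}
restored = ae.decode(codes["task_0"])
print(f"reconstruction loss: {initial_loss:.4f} -> {final_loss:.4f}")
```

In this sketch each task is stored as a 2-number code rather than an 8-number weight vector, and the reconstruction is lossy because four generic vectors do not fit in a two-dimensional subspace. How COLA structures its autoencoder and how closely it recovers task weights is not stated in the text above.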
