Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning
Xisen Jin, Bill Yuchen Lin, Mohammad Rostami, Xiang Ren

TL;DR
This paper introduces CLIF, a unified continual learning setup for NLP that enables models to accumulate knowledge over sequential tasks, improving rapid generalization to new tasks while mitigating forgetting.
Contribution
The paper proposes CLIF, a new framework combining continual learning and few-shot learning, along with a novel regularized adapter generation method for NLP models.
Findings
Continual learning improves generalization to new NLP tasks.
Catastrophic forgetting affects generalization to new tasks less than it affects performance on seen tasks.
Continual learning algorithms provide significant benefits in knowledge accumulation.
Abstract
The ability to continuously expand knowledge over time and utilize it to rapidly generalize to new tasks is a key feature of human linguistic intelligence. Existing models that pursue rapid generalization to new tasks (e.g., few-shot learning methods), however, are mostly trained in a single shot on fixed datasets, unable to dynamically expand their knowledge; while continual learning algorithms are not specifically designed for rapid generalization. We present a new learning setup, Continual Learning of Few-Shot Learners (CLIF), to address the challenges of both learning settings in a unified setup. CLIF assumes a model learns from a sequence of diverse NLP tasks arriving sequentially, accumulating knowledge for improved generalization to new tasks, while also retaining performance on the tasks learned earlier. We examine how the generalization ability is affected in the continual…
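The setup described in the abstract can be sketched as a two-phase evaluation loop: train on upstream tasks one after another, then measure (a) accuracy on the tasks already seen and (b) few-shot accuracy on new tasks. The sketch below is only an illustration under stated assumptions. `ToyModel`, `run_clif`, and the `(name, train, test)` task format are hypothetical names introduced here, and the memorizing learner is a stand-in rather than the paper's regularized adapter generation method.

```python
import copy
from typing import Dict, List, Tuple

Example = Tuple[str, str]                        # (input text, label); assumed format
Task = Tuple[str, List[Example], List[Example]]  # (name, train/support set, test set)


class ToyModel:
    """Stand-in learner that memorizes input -> label pairs (not the paper's model)."""

    def __init__(self) -> None:
        self.memory: Dict[str, str] = {}

    def train(self, examples: List[Example]) -> None:
        # Later tasks overwrite earlier labels for the same input,
        # which gives a crude picture of forgetting.
        for text, label in examples:
            self.memory[text] = label

    def accuracy(self, examples: List[Example]) -> float:
        if not examples:
            return 0.0
        correct = sum(1 for text, label in examples if self.memory.get(text) == label)
        return correct / len(examples)


def run_clif(model: ToyModel, upstream: List[Task], new_tasks: List[Task]):
    """Hypothetical protocol sketch: sequential training, then two evaluations."""
    # Phase 1: learn the upstream tasks in the order they arrive.
    for _name, train_set, _test_set in upstream:
        model.train(train_set)

    # Evaluation A: performance retained on the tasks seen during training.
    seen_acc = {name: model.accuracy(test_set) for name, _train, test_set in upstream}

    # Evaluation B: few-shot generalization. Each new task adapts a copy of the
    # model on a small support set, so the new tasks do not affect each other.
    few_shot_acc = {}
    for name, support_set, test_set in new_tasks:
        learner = copy.deepcopy(model)
        learner.train(support_set)
        few_shot_acc[name] = learner.accuracy(test_set)

    return seen_acc, few_shot_acc


if __name__ == "__main__":
    upstream = [
        ("task_a", [("x1", "pos")], [("x1", "pos")]),
        ("task_b", [("x1", "neg")], [("x1", "neg")]),  # conflicts with task_a
    ]
    new_tasks = [("task_c", [("z", "yes")], [("z", "yes"), ("w", "no")])]
    print(run_clif(ToyModel(), upstream, new_tasks))
```

Adapting a copy of the model for each new task is a design choice of this sketch: it keeps the few-shot measurements independent of one another and leaves the continually trained model unchanged.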
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Topic Modeling
Methods: Adapter
