Self-Expansion of Pre-trained Models with Mixture of Adapters for Continual Learning
Huiyi Wang, Haodong Lu, Lina Yao, Dong Gong

TL;DR
This paper introduces SEMA, a novel method for continual learning that dynamically expands pre-trained models with modular adapters based on detected distribution shifts, improving knowledge reuse and reducing model growth.
Contribution
SEMA automatically decides when to reuse or add adapters in pre-trained models for continual learning, balancing stability and plasticity effectively.
Findings
Achieves state-of-the-art performance in PTM-based continual learning.
Enables sub-linear model expansion rate.
Improves knowledge reuse without memory rehearsal.
Abstract
Continual learning (CL) aims to continually accumulate knowledge from a non-stationary data stream without catastrophic forgetting of learned knowledge, requiring a balance between stability and adaptability. Relying on the generalizable representation in pre-trained models (PTMs), PTM-based CL methods perform effective continual adaptation on downstream tasks by adding learnable adapters or prompts upon the frozen PTMs. However, many existing PTM-based CL methods use restricted adaptation on a fixed set of these modules to avoid forgetting, suffering from limited CL ability. Periodically adding task-specific modules results in linear model growth rate and impaired knowledge reuse. We propose Self-Expansion of pre-trained models with Modularized Adaptation (SEMA), a novel approach to enhance the control of stability-plasticity balance in PTM-based CL. SEMA automatically decides to reuse…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSpeech Recognition and Synthesis
MethodsSparse Evolutionary Training · Focus · Adapter
