Self-Expansion of Pre-trained Models with Mixture of Adapters for Continual Learning

Huiyi Wang; Haodong Lu; Lina Yao; Dong Gong

arXiv:2403.18886 · cs.LG · March 28, 2025 · 1 citation

TL;DR

This paper introduces SEMA, a novel method for continual learning that dynamically expands pre-trained models with modular adapters based on detected distribution shifts, improving knowledge reuse and reducing model growth.

Contribution

SEMA automatically decides when to reuse or add adapters in pre-trained models for continual learning, balancing stability and plasticity effectively.

Findings

1. Achieves state-of-the-art performance in PTM-based continual learning.
2. Enables a sub-linear model expansion rate.
3. Improves knowledge reuse without memory rehearsal.

Abstract

Continual learning (CL) aims to continually accumulate knowledge from a non-stationary data stream without catastrophic forgetting of learned knowledge, requiring a balance between stability and adaptability. Relying on the generalizable representation in pre-trained models (PTMs), PTM-based CL methods perform effective continual adaptation on downstream tasks by adding learnable adapters or prompts upon the frozen PTMs. However, many existing PTM-based CL methods use restricted adaptation on a fixed set of these modules to avoid forgetting, suffering from limited CL ability. Periodically adding task-specific modules results in linear model growth rate and impaired knowledge reuse. We propose Self-Expansion of pre-trained models with Modularized Adaptation (SEMA), a novel approach to enhance the control of stability-plasticity balance in PTM-based CL. SEMA automatically decides to reuse…
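
The reuse-or-expand decision described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `ShiftDetector` and `AdapterPool`, the scalar features, the z-score test, and the `threshold` value are all assumptions made for illustration, standing in for whatever shift detection SEMA actually uses. The sketch only shows the control flow: a new adapter is added when no existing one recognizes the incoming data, and an existing adapter is reused otherwise.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev


@dataclass
class ShiftDetector:
    """Hypothetical per-adapter descriptor: stores the feature statistics
    of the data its adapter was added for (an assumption, not SEMA's design)."""
    mu: float
    sigma: float

    @classmethod
    def fit(cls, features):
        # Clamp sigma so a constant feature list does not divide by zero.
        return cls(mu=mean(features), sigma=max(pstdev(features), 1e-6))

    def shift_score(self, features):
        # Mean absolute z-score of the new features under the stored statistics.
        return mean(abs(x - self.mu) / self.sigma for x in features)


@dataclass
class AdapterPool:
    """Hypothetical pool that grows only when every detector reports a shift."""
    threshold: float = 3.0  # assumed cut-off, chosen for illustration
    detectors: list = field(default_factory=list)

    def observe_task(self, features):
        """Return (adapter_index, expanded) for the incoming task data."""
        scores = [d.shift_score(features) for d in self.detectors]
        if scores and min(scores) <= self.threshold:
            # Some existing adapter already covers this distribution: reuse it.
            return scores.index(min(scores)), False
        # No adapter covers the data: expand with a new adapter and descriptor.
        self.detectors.append(ShiftDetector.fit(features))
        return len(self.detectors) - 1, True


# Toy one-dimensional "features" for three tasks; the second resembles the first.
tasks = {
    "task_a": [0.0, 0.1, -0.1, 0.05, -0.05],
    "task_b": [0.02, -0.03, 0.08, -0.07, 0.0],
    "task_c": [10.0, 10.2, 9.8, 10.1, 9.9],
}

pool = AdapterPool(threshold=3.0)
for name, feats in tasks.items():
    idx, expanded = pool.observe_task(feats)
    print(name, "-> adapter", idx, "(new)" if expanded else "(reused)")
```

In this toy run the pool ends with fewer adapters than tasks, which is the sense in which expansion driven by detected shift can grow sub-linearly, in contrast to adding one module per task.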

Code & Models

Repositories

huiyiwang01/sema-cl (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis

Methods: Sparse Evolutionary Training · Focus · Adapter