Sparse Tuning Enhances Plasticity in PTM-based Continual Learning

Huan Zhang; Shenghua Fan; Shuyu Dong; Yujin Zheng; Dingwen Wang; Fan Lyu

arXiv:2505.19943·cs.LG·November 17, 2025

TL;DR

This paper introduces MIST (Mutual Information-guided Sparse Tuning), a method that selectively updates fewer than 5% of pre-trained model parameters, chosen by their sensitivity to mutual information objectives, improving continual learning performance while preserving pre-trained knowledge.

Contribution

The paper proposes a mutual information-guided sparse tuning approach that updates fewer than 5% of pre-trained model parameters, enhancing adaptability and generalization in continual learning.

Findings

01. MIST improves performance across various benchmarks.
02. Fewer than 0.5% of parameters are updated per step.
03. Integrating MIST with baselines yields significant gains.

Abstract

Continual Learning with Pre-trained Models (PTMs) holds great promise for efficient adaptation across sequential tasks. However, most existing approaches freeze PTMs and rely on auxiliary modules like prompts or adapters, limiting model plasticity and leading to suboptimal generalization when facing significant distribution shifts. While full fine-tuning can improve adaptability, it risks disrupting crucial pre-trained knowledge. In this paper, we propose Mutual Information-guided Sparse Tuning (MIST), a plug-and-play method that selectively updates a small subset of PTM parameters, less than 5%, based on sensitivity to mutual information objectives. MIST enables effective task-specific adaptation while preserving generalization. To further reduce interference, we introduce strong sparsity regularization by randomly dropping gradients during tuning, resulting in fewer than 0.5% of parameters…
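
The two-stage sparsity described in the abstract (select a small subset of parameters by a sensitivity score, then randomly drop gradients within that subset) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the flat-list parameter layout, the plain SGD update, and the 0.9 drop probability are all assumptions made for illustration, and the sensitivity scores are taken as a given input rather than computed from a mutual information objective.

```python
import random


def top_fraction_mask(sensitivity, fraction=0.05):
    """Return a 0/1 mask keeping the `fraction` highest-scoring parameters.

    `sensitivity` is a hypothetical per-parameter score; how it would be
    derived from a mutual information objective is not shown here.
    """
    k = max(1, int(len(sensitivity) * fraction))
    # Indices of the k largest sensitivity scores.
    order = sorted(range(len(sensitivity)),
                   key=lambda i: sensitivity[i], reverse=True)
    mask = [0] * len(sensitivity)
    for i in order[:k]:
        mask[i] = 1
    return mask


def sparse_step(params, grads, mask, lr=0.1, drop_prob=0.9, rng=None):
    """Apply one SGD step to masked entries whose gradient survives dropping.

    Each selected gradient is dropped independently with probability
    `drop_prob` (an assumed value), so only a fraction of the selected
    subset is updated in any single step.
    """
    rng = rng or random.Random(0)
    new_params = list(params)
    updated = 0
    for i, (p, g) in enumerate(zip(params, grads)):
        if mask[i] and rng.random() >= drop_prob:
            new_params[i] = p - lr * g
            updated += 1
    return new_params, updated


if __name__ == "__main__":
    n = 1000
    scores = [random.random() for _ in range(n)]   # placeholder scores
    mask = top_fraction_mask(scores, fraction=0.05)
    params, grads = [0.0] * n, [1.0] * n
    params, updated = sparse_step(params, grads, mask)
    print(f"selected: {sum(mask)} of {n}, updated this step: {updated}")
```

Under these assumed settings, a 5% selection combined with a 0.9 drop probability updates about 0.5% of parameters per step in expectation, which is the order of magnitude the abstract reports; all unselected parameters stay at their pre-trained values.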

Code & Models

Repositories

zhwhu/mist
PyTorch · Official

Videos

Sparse Tuning Enhances Plasticity in PTM-based Continual Learning · Underline

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing