Parameter Importance-Driven Continual Learning for Foundation Models
Lingxiang Wang, Hainan Zhang, Zhiming Zheng

TL;DR
This paper introduces PIECE, a parameter importance-driven method that enables foundation models to learn new domain knowledge efficiently while preserving their general reasoning abilities, avoiding catastrophic forgetting without extra parameters.
Contribution
PIECE is a continual learning method that uses parameter importance estimators to select and update only the most important parameters. It requires neither historical data nor additional model parameters.
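The idea of importance-driven selective updating can be illustrated with a minimal sketch. This is not the authors' implementation: this summary does not say which importance estimators PIECE uses, so the squared-gradient score below is an assumed stand-in, and the function names (importance_scores, top_k_mask, masked_sgd_step), the 50% selection fraction, and the toy values are all hypothetical.

```python
from typing import List


def importance_scores(grads: List[float]) -> List[float]:
    """Assumed estimator: squared gradient magnitude as an importance proxy."""
    return [g * g for g in grads]


def top_k_mask(scores: List[float], fraction: float) -> List[bool]:
    """Mark the highest-scoring `fraction` of parameters as trainable."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:k])
    return [i in keep for i in range(len(scores))]


def masked_sgd_step(
    params: List[float], grads: List[float], mask: List[bool], lr: float
) -> List[float]:
    """Apply an SGD step only where the mask is True; other parameters stay frozen."""
    return [p - lr * g if m else p for p, g, m in zip(params, grads, mask)]


# Toy example with four scalar parameters (hypothetical values).
params = [0.5, -1.0, 2.0, 0.1]
grads = [0.01, -0.8, 0.05, 0.3]

mask = top_k_mask(importance_scores(grads), fraction=0.5)
updated = masked_sgd_step(params, grads, mask, lr=0.1)
```

In this toy run only the second and fourth parameters, which have the largest squared gradients, are updated. The other two keep their original values, and no parameters are added to the model.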
Findings
PIECE outperforms traditional methods in maintaining general capabilities.
Selective parameter updates lead to state-of-the-art continual learning results.
The method is effective across multiple language and multimodal models.
Abstract
Domain-specific post-training often causes catastrophic forgetting, making foundation models lose their general reasoning ability and limiting their adaptability to dynamic real-world environments. Preserving general capabilities while acquiring downstream domain knowledge is a central challenge for large language and multimodal models. Traditional continual learning methods, such as regularization, replay and architectural isolation, suffer from poor downstream performance, reliance on inaccessible historical data, or additional parameter overhead. While recent parameter-efficient tuning (PET) methods can alleviate forgetting, their effectiveness strongly depends on the choice of parameters and update strategies. In this paper, we introduce PIECE, a Parameter Importance Estimation-based Continual Enhancement method that preserves general ability while efficiently learning domain…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Topic Modeling
