Flattening Sharpness for Dynamic Gradient Projection Memory Benefits Continual Learning
Danruo Deng, Guangyong Chen, Jianye Hao, Qiong Wang, Pheng-Ann Heng

TL;DR
This paper proposes FS-DGPM, a continual learning method that flattens the weight loss landscape and adaptively learns the importance of each basis representing past tasks, improving learning stability and reducing forgetting.
Contribution
It introduces Flattening Sharpness (FS) and a soft weight mechanism for dynamic basis importance, enhancing continual learning performance.
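As a rough illustration of what flattening a loss landscape can mean in practice, the sketch below takes each descent step using the gradient measured at an adversarially perturbed weight, so that sharp regions are penalized. This is a generic sharpness-aware update on a hypothetical one-dimensional toy loss, not the authors' FS procedure; the names `toy_loss`, `toy_grad`, `flat_step`, and the parameters `lr` and `rho` are assumptions made for the example.

```python
def toy_loss(w):
    # Hypothetical toy loss with minima at w = -1 and w = 1.
    return (w ** 2 - 1.0) ** 2


def toy_grad(w):
    # Analytic derivative of toy_loss.
    return 4.0 * w * (w ** 2 - 1.0)


def flat_step(w, lr=0.05, rho=0.1):
    """One sharpness-aware step (assumed form, for illustration only)."""
    g = toy_grad(w)
    # Move a distance rho in the direction that increases the loss fastest.
    eps = rho * g / abs(g) if g != 0.0 else 0.0
    # Descend using the gradient seen at the perturbed weight.
    return w - lr * toy_grad(w + eps)


w_start = 2.0
w_next = flat_step(w_start)
```

The only difference from plain gradient descent is where the gradient is evaluated: at `w + eps` rather than at `w`.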
Findings
Outperforms baseline methods in continual learning tasks.
Effectively reduces catastrophic forgetting.
Improves the generalization of learned skills.
Abstract
Backpropagation networks are notably susceptible to catastrophic forgetting, where networks tend to forget previously learned skills upon learning new ones. To address this 'sensitivity-stability' dilemma, most previous efforts have focused on minimizing the empirical risk with different parameter regularization terms and episodic memory, but have rarely explored the use of the weight loss landscape. In this paper, we investigate the relationship between the weight loss landscape and sensitivity-stability in the continual learning scenario, based on which we propose a novel method, Flattening Sharpness for Dynamic Gradient Projection Memory (FS-DGPM). In particular, we introduce a soft weight to represent the importance of each basis representing past tasks in GPM, which can be adaptively learned during the learning process, so that less important bases can be…
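The soft-weight idea in the abstract can be pictured with a small sketch. It assumes that gradient projection removes from a new-task gradient its components along stored orthonormal bases, and that a soft weight in [0, 1] scales how much of each component is removed. The function `project_gradient` and the toy vectors are hypothetical, and the exact update used by FS-DGPM is not given in the text above.

```python
def project_gradient(grad, bases, importance):
    """Remove from grad its components along bases, scaled by importance.

    grad:       list of d floats, gradient of the new task's loss
    bases:      list of k orthonormal vectors (d floats each) for past tasks
    importance: list of k soft weights in [0, 1]; 1 protects a basis fully,
                0 releases it for the new task (assumed semantics)
    """
    projected = list(grad)
    for basis, weight in zip(bases, importance):
        # Coordinate of the gradient along this basis vector.
        coeff = sum(b * g for b, g in zip(basis, grad))
        projected = [p - weight * coeff * b for p, b in zip(projected, basis)]
    return projected


# Hypothetical toy data: two orthonormal bases in R^3.
bases = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
grad = [2.0, 4.0, 6.0]

hard = project_gradient(grad, bases, [1.0, 1.0])   # every basis fully protected
soft = project_gradient(grad, bases, [1.0, 0.25])  # second basis mostly released
```

With all weights set to 1 the update is orthogonal to every stored basis. Lowering a weight lets part of the gradient along that basis through, which is the sense in which a less important basis constrains the new task less.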
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications
