Alleviating Representational Shift for Continual Fine-tuning
Shibo Jie, Zhi-Hong Deng, Ziheng Li

TL;DR
This paper introduces ConFiT, a fine-tuning method for continual learning that mitigates representational shift by combining cross-convolution batch normalization (Xconv BN) with hierarchical fine-tuning, reducing catastrophic forgetting.
Contribution
The paper proposes ConFiT, a fine-tuning method that addresses representational shift in both the features (penultimate-layer representations) and the intermediate layers, where the shift disrupts batch normalization.
Findings
Outperforms state-of-the-art methods on four datasets.
Reduces catastrophic forgetting with lower storage overhead.
Effectively maintains feature representations during fine-tuning.
Abstract
We study a practical setting of continual learning: fine-tuning on a pre-trained model continually. Previous work has found that, when training on new tasks, the features (penultimate layer representations) of previous data will change, called representational shift. Besides the shift of features, we reveal that the intermediate layers' representational shift (IRS) also matters since it disrupts batch normalization, which is another crucial cause of catastrophic forgetting. Motivated by this, we propose ConFiT, a fine-tuning method incorporating two components, cross-convolution batch normalization (Xconv BN) and hierarchical fine-tuning. Xconv BN maintains pre-convolution running means instead of post-convolution, and recovers post-convolution ones before testing, which corrects the inaccurate estimates of means under IRS. Hierarchical fine-tuning leverages a multi-stage strategy to…
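The abstract says Xconv BN keeps pre-convolution running means and recovers the post-convolution ones before testing. A minimal sketch of why such a recovery is possible, assuming a plain stride-1 convolution without padding: convolution is linear, so the mean of the convolved batch equals the convolution of the batch mean. The helper names (`conv2d`, `recover_post_conv_mean`), the tensor shapes, and the choice to store the mean as a full per-position map are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def conv2d(x, weight, bias):
    """Naive 'valid' 2D cross-correlation, stride 1 (illustrative helper).
    x: (N, C_in, H, W); weight: (C_out, C_in, K, K); bias: (C_out,)"""
    n, c_in, h, w = x.shape
    c_out, _, k, _ = weight.shape
    out = np.zeros((n, c_out, h - k + 1, w - k + 1))
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            patch = x[:, :, i:i + k, j:j + k]  # (N, C_in, K, K)
            # Contract over input channels and kernel positions -> (N, C_out)
            out[:, :, i, j] = np.tensordot(patch, weight, axes=([1, 2, 3], [1, 2, 3]))
    return out + bias[None, :, None, None]

def recover_post_conv_mean(pre_conv_mean, weight, bias):
    """Recover per-channel post-convolution means from a stored
    pre-convolution mean map of shape (C_in, H, W).
    Assumption: the mean is stored per spatial position, not per channel."""
    post = conv2d(pre_conv_mean[None], weight, bias)[0]  # (C_out, H', W')
    return post.mean(axis=(1, 2))

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 3, 6, 6))        # a batch of pre-convolution activations
weight = rng.normal(size=(4, 3, 3, 3))   # convolution weights at test time
bias = rng.normal(size=4)

pre_conv_mean = x.mean(axis=0)           # statistic kept before the convolution
recovered = recover_post_conv_mean(pre_conv_mean, weight, bias)
direct = conv2d(x, weight, bias).mean(axis=(0, 2, 3))  # what BN would measure directly
```

In this sketch `recovered` and `direct` agree up to floating-point error, and the recovery uses whatever weights are passed in, so the post-convolution mean follows the convolution's current parameters rather than the ones in place when the statistic was collected.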
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Batch Normalization
