On the Effectiveness of LayerNorm Tuning for Continual Learning in Vision Transformers
Thomas De Min, Massimiliano Mancini, Karteek Alahari, Xavier Alameda-Pineda, Elisa Ricci

TL;DR
This paper proposes tuning LayerNorm parameters for continual learning in Vision Transformers, reducing computational cost while maintaining competitive performance through a two-stage training procedure and key-based parameter selection at inference.
Contribution
It introduces a simple yet effective method of learning task-specific LayerNorm parameters for continual learning, improving efficiency without sacrificing accuracy.
Findings
Achieves state-of-the-art or comparable results on ImageNet-R and CIFAR-100.
Reduces computational costs compared to existing rehearsal-free methods.
Demonstrates robustness to incorrect parameter selection during inference.
Abstract
State-of-the-art rehearsal-free continual learning methods exploit the peculiarities of Vision Transformers to learn task-specific prompts, drastically reducing catastrophic forgetting. However, there is a tradeoff between the number of learned parameters and the performance, making such models computationally expensive. In this work, we aim to reduce this cost while maintaining competitive performance. We achieve this by revisiting and extending a simple transfer learning idea: learning task-specific normalization layers. Specifically, we tune the scale and bias parameters of LayerNorm for each continual learning task, selecting them at inference time based on the similarity between task-specific keys and the output of the pre-trained model. To make the classifier robust to incorrect selection of parameters during inference, we introduce a two-stage training procedure, where we first…
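The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes cosine similarity as the similarity measure (the abstract only says "similarity"), uses a single toy LayerNorm instead of a full Vision Transformer, and omits the two-stage training procedure entirely. All names (`layer_norm`, `select_task`, `task_keys`, `task_params`) and all numeric values are hypothetical.

```python
import math

def layer_norm(x, scale, bias, eps=1e-5):
    """Normalize x to zero mean and unit variance, then apply scale and bias."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [s * (v - mean) * inv_std + b for v, s, b in zip(x, scale, bias)]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm

def select_task(query, task_keys):
    """Return the id of the task whose key is most similar to the query."""
    return max(task_keys, key=lambda t: cosine(query, task_keys[t]))

# Hypothetical per-task LayerNorm parameters and keys (toy values).
task_params = {
    0: {"scale": [1.0, 1.0, 1.0, 1.0], "bias": [0.0, 0.0, 0.0, 0.0]},
    1: {"scale": [2.0, 2.0, 2.0, 2.0], "bias": [0.5, 0.5, 0.5, 0.5]},
}
task_keys = {0: [1.0, 0.0, 0.0, 0.0], 1: [0.0, 1.0, 0.0, 0.0]}

# Stand-in for the feature produced by the frozen pre-trained model.
query = [0.1, 0.9, 0.0, 0.0]
task_id = select_task(query, task_keys)

# Apply the selected task's scale and bias to a toy hidden activation.
params = task_params[task_id]
hidden = [1.0, 2.0, 3.0, 4.0]
out = layer_norm(hidden, params["scale"], params["bias"])
```

In this toy setup the query is closest to the key of task 1, so that task's scale and bias are applied. Only the small per-task scale and bias vectors differ between tasks, which is the source of the parameter savings the abstract refers to.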
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · COVID-19 diagnosis using AI
