On the Effectiveness of LayerNorm Tuning for Continual Learning in Vision Transformers

Thomas De Min; Massimiliano Mancini; Karteek Alahari; Xavier Alameda-Pineda; Elisa Ricci

arXiv:2308.09610·cs.CV·August 21, 2023

TL;DR

This paper proposes tuning task-specific LayerNorm parameters for continual learning in Vision Transformers. It reduces computational cost while maintaining competitive performance by combining a two-stage training procedure with key-based selection of the parameters at inference time.

Contribution

It introduces a simple yet effective method of learning task-specific LayerNorm parameters for continual learning, improving efficiency without sacrificing accuracy.

Findings

1. Achieves state-of-the-art or comparable results on ImageNet-R and CIFAR-100.
2. Reduces computational costs compared to existing rehearsal-free methods.
3. Demonstrates robustness to incorrect parameter selection during inference.

Abstract

State-of-the-art rehearsal-free continual learning methods exploit the peculiarities of Vision Transformers to learn task-specific prompts, drastically reducing catastrophic forgetting. However, there is a tradeoff between the number of learned parameters and the performance, making such models computationally expensive. In this work, we aim to reduce this cost while maintaining competitive performance. We achieve this by revisiting and extending a simple transfer learning idea: learning task-specific normalization layers. Specifically, we tune the scale and bias parameters of LayerNorm for each continual learning task, selecting them at inference time based on the similarity between task-specific keys and the output of the pre-trained model. To make the classifier robust to incorrect selection of parameters during inference, we introduce a two-stage training procedure, where we first…
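
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `layer_norm`, `select_task`, `task_keys`, and `task_params` are hypothetical, the numbers are made up, and cosine similarity is an assumption (the abstract only says "similarity"). The sketch stores one LayerNorm scale/bias pair and one key per task, picks the task whose key best matches a feature assumed to come from the frozen pre-trained model, and applies that task's LayerNorm parameters.

```python
import math


def layer_norm(x, scale, bias, eps=1e-5):
    """Normalize x to zero mean and unit variance, then apply scale and bias."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [s * (v - mean) * inv_std + b for v, s, b in zip(x, scale, bias)]


def cosine(a, b):
    """Cosine similarity between two vectors (assumed similarity measure)."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b + 1e-12)


def select_task(feature, keys):
    """Return the index of the task key most similar to the feature."""
    sims = [cosine(feature, k) for k in keys]
    return max(range(len(keys)), key=sims.__getitem__)


# Hypothetical per-task LayerNorm parameters (one scale/bias pair per task).
task_params = [
    {"scale": [1.0, 1.0, 1.0, 1.0], "bias": [0.0, 0.0, 0.0, 0.0]},
    {"scale": [2.0, 2.0, 2.0, 2.0], "bias": [0.5, 0.5, 0.5, 0.5]},
]

# Hypothetical task-specific keys, one per task.
task_keys = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]

# Stand-in for the pre-trained model's output for one input image.
feature = [0.1, 0.9, 0.0, 0.2]

task_id = select_task(feature, task_keys)
params = task_params[task_id]

# Apply the selected task's LayerNorm parameters to some hidden activation.
hidden = [1.0, 2.0, 3.0, 4.0]
out = layer_norm(hidden, params["scale"], params["bias"])
print(task_id, out)
```

In this toy setup only the scale and bias vectors and the keys differ between tasks, which is why the per-task storage stays small. A wrong key match would silently apply another task's parameters, which is the failure mode the two-stage training procedure in the abstract is meant to make the classifier robust to.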

Code & Models

Repositories

tdemin16/continual-layernorm-tuning (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · COVID-19 diagnosis using AI