Parameter Efficient Fine-tuning of Self-supervised ViTs without Catastrophic Forgetting

Reza Akbarian Bafghi; Nidhin Harilal; Claire Monteleoni; Maziar Raissi

arXiv:2404.17245·cs.CV·July 8, 2024

TL;DR

This paper introduces parameter-efficient fine-tuning methods for self-supervised vision transformers that significantly reduce catastrophic forgetting and improve adaptation to new domains.

Contribution

It studies two parameter-efficient fine-tuning strategies, Block Expansion and low-rank adaptation (LoRA), which outperform full fine-tuning in new domains while preserving pre-training performance.

Findings

01. Block Expansion and LoRA outperform full fine-tuning in new domains.
02. These methods significantly reduce the number of parameters needed for adaptation.
03. They mitigate catastrophic forgetting in pre-trained ViTs.
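
As a rough illustration of the parameter savings in finding 02, the sketch below counts the trainable parameters of a single dense projection against those of a low-rank update to the same projection. The hidden size of 768 (the usual ViT-Base width) and the rank of 8 are assumptions chosen for illustration, not values reported on this page, and the helper names are hypothetical.

```python
def dense_param_count(d_in: int, d_out: int) -> int:
    """Weights in a full d_out x d_in projection (bias ignored)."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Weights in a rank-`rank` update B @ A, with A: rank x d_in and B: d_out x rank."""
    return rank * (d_in + d_out)


d = 768    # assumed hidden size (ViT-Base)
rank = 8   # assumed LoRA rank, for illustration only

full = dense_param_count(d, d)       # parameters updated by full fine-tuning
lora = lora_param_count(d, d, rank)  # parameters updated by the low-rank adapter

print(f"full: {full}, lora: {lora}, ratio: {lora / full:.4f}")
```

Under these assumptions the low-rank update trains 48 times fewer weights for this one projection; the savings for a whole model depend on which layers are adapted.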

Abstract

Artificial neural networks often suffer from catastrophic forgetting, where learning new concepts leads to a complete loss of previously acquired knowledge. We observe that this issue is particularly magnified in vision transformers (ViTs), where post-pre-training and fine-tuning on new tasks can significantly degrade the model's original general abilities. For instance, a DINO ViT-Base/16 pre-trained on ImageNet-1k loses over 70% accuracy on ImageNet-1k after just 10 iterations of fine-tuning on CIFAR-100. Overcoming this stability-plasticity dilemma is crucial for enabling ViTs to continuously learn and adapt to new domains while preserving their initial knowledge. In this work, we study two new parameter-efficient fine-tuning strategies: (1) Block Expansion, and (2) low-rank adaptation (LoRA). Our experiments reveal that using either Block Expansion or LoRA on self-supervised…
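
The two strategies named in the abstract share one property that helps explain why they can preserve pre-trained behavior: at initialization, the adapted model computes the same function as the frozen one. The NumPy sketch below shows this under stated assumptions. It is not the authors' implementation (the rezaakb/peft-vit repository is the reference for that); the toy residual block stands in for a transformer block, and all sizes, names, and the scaling factor `alpha` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, rank, alpha = 16, 4, 8.0  # hypothetical sizes and LoRA scaling

W = rng.normal(size=(d, d))                 # frozen pre-trained weight
A = rng.normal(scale=0.01, size=(rank, d))  # trainable, small random init
B = np.zeros((d, rank))                     # trainable, zero init


def lora_forward(x, W, A, B, alpha, rank):
    """LoRA: y = x W^T + (alpha / rank) * x A^T B^T, with W kept frozen."""
    return x @ W.T + (alpha / rank) * (x @ A.T) @ B.T


def residual_block(x, W_in, W_out):
    """Toy residual block (assumed stand-in): x + relu(x W_in^T) W_out^T."""
    return x + np.maximum(x @ W_in.T, 0.0) @ W_out.T


x = rng.normal(size=(2, d))

# LoRA: with B = 0 the low-rank term vanishes, so the output is unchanged.
same_at_init = np.allclose(lora_forward(x, W, A, B, alpha, rank), x @ W.T)

# Block expansion (as assumed here): insert a copied block whose output
# projection is zeroed, so the new block starts as an identity mapping.
W_in_new = W.copy()
W_out_new = np.zeros((d, d))
identity_at_init = np.allclose(residual_block(x, W_in_new, W_out_new), x)

print(same_at_init, identity_at_init)
```

In both cases only the newly added weights (`A` and `B`, or the new block) would be trained, while the original weights stay frozen.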

Code & Models

Repositories

rezaakb/peft-vit (PyTorch, official)

Taxonomy

Topics: Neural Networks and Reservoir Computing · Neural Networks and Applications · Blind Source Separation Techniques

Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Multi-Head Attention · Dense Connections · Residual Connection · Softmax · Vision Transformer · self-DIstillation with NO labels