Lessons and Insights from a Unifying Study of Parameter-Efficient Fine-Tuning (PEFT) in Visual Recognition

Zheda Mai; Ping Zhang; Cheng-Hao Tu; Hong-You Chen; Li Zhang; Wei-Lun Chao

arXiv:2409.16434 · cs.LG · March 26, 2025

TL;DR

This paper systematically compares various parameter-efficient fine-tuning methods for visual recognition, revealing their comparable accuracy, different error patterns, and potential for ensemble use, while also exploring their robustness and efficiency.

Contribution

The paper provides a comprehensive empirical study of PEFT methods on Vision Transformers, offering practical insights, a user guide, and new findings on their performance and robustness.

Findings

01. When carefully tuned, different PEFT methods achieve similar accuracy on the low-shot benchmark VTAB-1K.

02. Different PEFT methods make different mistakes and differ in which predictions they make with high confidence.

03. PEFT is effective beyond low-shot regimes, matching or surpassing full fine-tuning with fewer parameters.

Abstract

Parameter-efficient fine-tuning (PEFT) has attracted significant attention due to the growth of pre-trained model sizes and the need to fine-tune (FT) them for superior downstream performance. Despite a surge in new PEFT methods, a systematic study to understand their performance and suitable application scenarios is lacking, leaving questions like "when to apply PEFT" and "which method to use" largely unanswered, especially in visual recognition. In this paper, we conduct a unifying empirical study of representative PEFT methods with Vision Transformers. We systematically tune their hyperparameters to fairly compare their accuracy on downstream tasks. Our study offers a practical user guide and unveils several new insights. First, if tuned carefully, different PEFT methods achieve similar accuracy in the low-shot benchmark VTAB-1K. This includes simple approaches like FT the bias terms…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Face and Expression Recognition · Neural Networks and Applications

Methods: Softmax · Attention Is All You Need · Contrastive Language-Image Pre-training