Surgical Fine-Tuning Improves Adaptation to Distribution Shifts

Yoonho Lee; Annie S. Chen; Fahim Tajwar; Ananya Kumar; Huaxiu Yao; Percy Liang; Chelsea Finn

arXiv:2210.11466 · cs.LG · June 7, 2023 · 47 cites

TL;DR

This paper introduces surgical fine-tuning, which adapts a pre-trained model to a distribution shift by fine-tuning only a selected subset of its layers. It evaluates the approach on seven real-world tasks spanning three types of distribution shift and gives a theoretical account of why it can help.

Contribution

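The paper proposes surgical fine-tuning and shows that tuning a selected subset of layers can match or outperform commonly used fine-tuning approaches. It also proves that, for two-layer neural networks in an idealized setting, first-layer tuning can outperform fine-tuning all layers.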

Findings

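01 Selectively fine-tuning a subset of layers matches or outperforms commonly used fine-tuning approaches.

02 Which subset of layers is most effective to tune depends on the type of distribution shift; for image corruptions, fine-tuning only the first few layers works best.

03 For two-layer neural networks in an idealized setting, first-layer tuning provably can outperform fine-tuning all layers.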

Abstract

A common approach to transfer learning under distribution shift is to fine-tune the last few layers of a pre-trained model, preserving learned features while also adapting to the new task. This paper shows that in such settings, selectively fine-tuning a subset of layers (which we term surgical fine-tuning) matches or outperforms commonly used fine-tuning approaches. Moreover, the type of distribution shift influences which subset is more effective to tune: for example, for image corruptions, fine-tuning only the first few layers works best. We validate our findings systematically across seven real-world data tasks spanning three types of distribution shifts. Theoretically, we prove that for two-layer neural networks in an idealized setting, first-layer tuning can outperform fine-tuning all layers. Intuitively, fine-tuning more parameters on a small target dataset can cause information…
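The layer-selection step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper `surgical_mask`, the parameter names, and the block prefixes are all hypothetical and chosen only to show how one subset of layers is marked trainable while the rest stay frozen.

```python
from typing import Dict, Iterable, Sequence


def surgical_mask(
    param_names: Iterable[str], tuned_prefixes: Sequence[str]
) -> Dict[str, bool]:
    """Map each parameter name to True (fine-tune) or False (keep frozen).

    A parameter is selected when its name starts with any of the given
    prefixes. Hypothetical helper, not part of any library.
    """
    prefixes = tuple(tuned_prefixes)
    return {name: name.startswith(prefixes) for name in param_names}


# Hypothetical parameter names for a small network with three blocks and a head.
names = [
    "block1.conv.weight", "block1.conv.bias",
    "block2.conv.weight", "block2.conv.bias",
    "block3.conv.weight", "block3.conv.bias",
    "head.fc.weight", "head.fc.bias",
]

# Input-level shift such as image corruption: tune only the first block.
first_block = surgical_mask(names, ["block1."])

# Conventional baseline: tune only the last layers.
last_layers = surgical_mask(names, ["head."])

print([name for name, tuned in first_block.items() if tuned])
print([name for name, tuned in last_layers.items() if tuned])
```

In PyTorch, a mask like this could be applied by setting `requires_grad` on each entry of `model.named_parameters()`, so that the optimizer updates only the selected block. Which block to select for a given shift is the empirical question the paper studies.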

Code & Models

Repositories

anniesch/surgical-finetuning (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis · Machine Learning in Healthcare