On the Effectiveness of Parameter-Efficient Fine-Tuning

Zihao Fu; Haoran Yang; Anthony Man-Cho So; Wai Lam; Lidong Bing; Nigel Collier

arXiv:2211.15583 · cs.CL · November 29, 2022 · 1 citation

TL;DR

This paper analyzes parameter-efficient fine-tuning methods, gives a theoretical account of their stability and generalization, and introduces a Second-order Approximation Method (SAM) for choosing which parameters to tune.

Contribution

It categorizes existing sparse fine-tuning methods, offers a theoretical analysis of their regularization effect, and proposes SAM for better parameter selection.
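To make "parameter selection" concrete, here is a minimal sketch of one generic second-order selection rule. It is an illustration under stated assumptions, not the paper's SAM as published: it assumes a diagonal quadratic model of the loss around the pre-trained weights, under which tuning parameter i alone by the step -g_i / h_i lowers the loss by g_i^2 / (2 h_i). The function names, the gradient values, and the Hessian-diagonal values are all hypothetical.

```python
from typing import List


def second_order_scores(grads: List[float], hess_diag: List[float]) -> List[float]:
    """Estimated loss decrease from tuning each parameter on its own.

    Assumes a diagonal quadratic model with positive curvature h_i:
    the best step for parameter i is -g_i / h_i, which lowers the
    loss by g_i**2 / (2 * h_i).
    """
    return [g * g / (2.0 * h) for g, h in zip(grads, hess_diag)]


def select_tunable(grads: List[float], hess_diag: List[float], k: int) -> List[int]:
    """Return a 0/1 mask marking the k parameters with the highest score."""
    scores = second_order_scores(grads, hess_diag)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = set(ranked[:k])
    return [1 if i in chosen else 0 for i in range(len(scores))]


# Hypothetical gradient and Hessian-diagonal values at the pre-trained weights.
grads = [0.1, -2.0, 0.5, 1.0]
hess_diag = [1.0, 2.0, 0.5, 1.0]

mask = select_tunable(grads, hess_diag, k=2)
print(mask)
```

The resulting mask can then be held fixed during fine-tuning so that only the selected parameters are updated.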

Findings

01 Sparsity acts as regularization, improving stability and generalization.

02 Theoretical analysis explains why sparse fine-tuning outperforms full fine-tuning.

03 SAM outperforms strong baselines in experiments.
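The mechanics behind sparse fine-tuning can be sketched on a toy problem. The example below is a hedged illustration, not the authors' code: it uses a randomly chosen mask (one of the families of approaches the paper categorizes), a small least-squares objective as a stand-in for a task loss, and hypothetical names such as `masked_gd`. Parameters with mask 0 stay exactly at their pre-trained values, which is the constraint the findings describe as acting like regularization.

```python
import random


def mse(theta, data):
    """Mean squared error of a linear model on (x, y) pairs."""
    return sum((sum(t * xi for t, xi in zip(theta, x)) - y) ** 2
               for x, y in data) / len(data)


def masked_gd(theta0, mask, data, lr=0.1, steps=200):
    """Gradient descent that updates only parameters whose mask entry is 1."""
    theta = list(theta0)
    for _ in range(steps):
        grad = [0.0] * len(theta)
        for x, y in data:
            err = sum(t * xi for t, xi in zip(theta, x)) - y
            for i, xi in enumerate(x):
                grad[i] += 2.0 * err * xi / len(data)
        # Multiplying by the mask zeroes the update for frozen parameters.
        theta = [t - lr * m * g for t, m, g in zip(theta, mask, grad)]
    return theta


random.seed(0)
theta0 = [1.0, -1.0, 0.5, 0.0]  # stand-in for pre-trained weights (assumed)
k = 2                            # number of tunable parameters (assumed)
tunable = set(random.sample(range(len(theta0)), k))
mask = [1 if i in tunable else 0 for i in range(len(theta0))]

# Toy task data, made up for illustration only.
data = [
    ([1.0, 0.0, 1.0, 0.0], 2.0),
    ([0.0, 1.0, 0.0, 1.0], 0.0),
    ([1.0, 1.0, 0.0, 0.0], 1.0),
]

theta = masked_gd(theta0, mask, data)
frozen_unchanged = all(t == t0 for t, t0, m in zip(theta, theta0, mask) if m == 0)
print(mask, frozen_unchanged, mse(theta0, data), mse(theta, data))
```

Because the frozen weights are shared with the pre-trained model, only the `k` tuned values need to be stored per task.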

Abstract

Fine-tuning pre-trained models has been ubiquitously proven to be effective in a wide range of NLP tasks. However, fine-tuning the whole model is parameter-inefficient, as it always yields an entirely new model for each task. Currently, many research works propose to fine-tune only a small portion of the parameters while keeping most of the parameters shared across different tasks. These methods achieve surprisingly good performance and are shown to be more stable than their corresponding fully fine-tuned counterparts. However, such methods are still not well understood. Some natural questions arise: How does the parameter sparsity lead to promising performance? Why is the model more stable than the fully fine-tuned models? How should the tunable parameters be chosen? In this paper, we first categorize the existing methods into random approaches, rule-based approaches, and…

Code & Models

Repositories

fuzihaofzh/analyzeparameterefficientfinetune
PyTorch · Official

Videos

On the Effectiveness of Parameter-Efficient Fine-Tuning · Underline

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Multimodal Machine Learning Applications