Towards a Unified View of Parameter-Efficient Transfer Learning

Junxian He; Chunting Zhou; Xuezhe Ma; Taylor Berg-Kirkpatrick; Graham Neubig

arXiv:2110.04366 · cs.CL · February 3, 2022 · 280 cites

TL;DR

This paper presents a unified framework for parameter-efficient transfer learning in NLP, clarifying the relationships among methods and enabling the development of more effective, less parameter-intensive fine-tuning approaches.

Contribution

It introduces a unified view that models various parameter-efficient methods as modifications to hidden states, facilitating understanding and transfer of design principles across approaches.

Findings

01 Unified framework clarifies connections among methods

02 New methods tune fewer parameters with comparable performance

03 Empirical validation across multiple NLP tasks

Abstract

Fine-tuning large pre-trained language models on downstream tasks has become the de-facto learning paradigm in NLP. However, conventional approaches fine-tune all the parameters of the pre-trained model, which becomes prohibitive as the model size and the number of tasks grow. Recent work has proposed a variety of parameter-efficient transfer learning methods that only fine-tune a small number of (extra) parameters to attain strong performance. While effective, the critical ingredients for success and the connections among the various methods are poorly understood. In this paper, we break down the design of state-of-the-art parameter-efficient transfer learning methods and present a unified framework that establishes connections between them. Specifically, we re-frame them as modifications to specific hidden states in pre-trained models, and define a set of design dimensions along which…
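The re-framing described in the abstract, where each method is a modification to a hidden state, can be illustrated with a minimal Python sketch. It assumes a bottleneck form of the update, Δh = f(x · W_down) · W_up, composed additively with the frozen hidden state as h ← h + s · Δh. The function names, the scale s, and the toy dimensions are hypothetical choices for illustration, not code from the paper's repository. Roughly speaking, dropping the nonlinearity gives a LoRA-style low-rank update, while keeping it gives an adapter-style one.

```python
def matvec(x, W):
    """Multiply a row vector x (length n) by a matrix W (n x m)."""
    return [sum(x[i] * W[i][j] for i in range(len(x)))
            for j in range(len(W[0]))]


def relu(v):
    """Element-wise ReLU, used here as an example nonlinearity f."""
    return [max(0.0, a) for a in v]


def delta_h(x, W_down, W_up, nonlinearity=None):
    """Bottleneck update: project to r dims, optionally apply f, project back."""
    z = matvec(x, W_down)
    if nonlinearity is not None:
        z = nonlinearity(z)
    return matvec(z, W_up)


def modify_hidden(h, dh, scale=1.0):
    """Additive composition (assumed form): h <- h + scale * dh."""
    return [a + scale * b for a, b in zip(h, dh)]


def trainable_params(d, r):
    """Weights in W_down (d x r) plus W_up (r x d); biases are ignored."""
    return 2 * d * r


# Toy example with d = 2 and bottleneck r = 1 (hypothetical values).
h = [1.0, 2.0]                 # hidden state from the frozen model
W_down = [[1.0], [1.0]]        # trainable, shape d x r
W_up = [[0.5, -0.5]]           # trainable, shape r x d

adapter_style = modify_hidden(h, delta_h(h, W_down, W_up, relu))
lora_style = modify_hidden(h, delta_h(h, W_down, W_up), scale=2.0)
```

Under these assumptions, a module with d = 768 and r = 8 (hypothetical sizes) would train 12,288 weights per modified hidden state, compared with 589,824 for a full d × d matrix, which is the sense in which only a small number of extra parameters are tuned.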

Code & Models

Repositories

jxhe/unify-parameter-efficient-tuning · jax · Official

Videos

Towards a Unified View of Parameter-Efficient Transfer Learning · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications