Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning

Haokun Liu; Derek Tam; Mohammed Muqeeth; Jay Mohta; Tenghao Huang; Mohit Bansal; Colin Raffel

arXiv:2205.05638 · cs.LG · August 29, 2022 · 293 cites

TL;DR

This paper shows that few-shot parameter-efficient fine-tuning (PEFT) is more accurate and computationally cheaper than few-shot in-context learning (ICL), introduces a new PEFT method, (IA)$^3$, and presents T-Few, a recipe reported to reach super-human performance on unseen tasks.

Contribution

The paper rigorously compares few-shot PEFT and ICL, introduces the (IA)$^3$ method, and proposes T-Few, a few-shot recipe for unseen tasks that reports state-of-the-art results.

Findings

01

Few-shot PEFT is more accurate than few-shot ICL and has dramatically lower computational cost.

02

The new (IA)$^3$ method improves performance with minimal additional parameters.

03

T-Few achieves super-human performance on unseen tasks, surpassing the previous state of the art.

Abstract

Few-shot in-context learning (ICL) enables pre-trained language models to perform a previously-unseen task without any gradient-based training by feeding a small number of training examples as part of the input. ICL incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made. Parameter-efficient fine-tuning (PEFT) (e.g. adapter modules, prompt tuning, sparse update methods, etc.) offers an alternative paradigm where a small set of parameters are trained to enable a model to perform the new task. In this paper, we rigorously compare few-shot ICL and PEFT and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs. Along the way, we introduce a new PEFT method called (IA)$^3$ that scales activations by learned vectors, attaining stronger performance…
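
The idea of scaling activations by learned vectors can be illustrated with a small sketch. This is not the authors' implementation. It assumes, for illustration only, that the learned vectors rescale the keys and values of a dot-product attention layer. The names `l_k`, `l_v`, `scale`, and `attention` are hypothetical, and the abstract excerpt above states only that activations are scaled by learned vectors. Initializing the vectors to ones (also an assumption here) leaves the frozen model's output unchanged before any training.

```python
import math


def scale(rows, vec):
    """Multiply each activation vector in `rows` elementwise by `vec`."""
    return [[x * s for x, s in zip(row, vec)] for row in rows]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, l_k, l_v):
    """Dot-product attention with keys and values rescaled by vectors.

    Assumption: l_k and l_v are the only trainable parameters; the
    inputs stand in for activations produced by frozen weights.
    """
    keys = scale(keys, l_k)
    values = scale(values, l_v)
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


# Toy activations (made-up numbers, not from the paper).
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]

# Vectors of ones act as the identity, so this equals plain attention.
ones = [1.0, 1.0]
baseline = attention(queries, keys, values, ones, ones)

# Changing a scaling vector changes the output without touching any weights.
rescaled = attention(queries, keys, values, ones, [2.0, 2.0])

# New parameters in this sketch: one scalar per key and value dimension.
num_new_params = len(ones) + len(ones)
```

In this sketch the number of new parameters grows with the activation dimension rather than with the size of a weight matrix, which is one way to read the abstract's framing of PEFT as training only a small set of parameters.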

Code & Models

Videos

Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning · slideslive

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications

Methods: Adapter