Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning
Haokun Liu, Derek Tam, Mohammed Muqeeth, Jay Mohta, Tenghao Huang, Mohit Bansal, Colin Raffel

TL;DR
This paper shows that parameter-efficient fine-tuning (PEFT) is both more accurate and computationally cheaper than few-shot in-context learning (ICL), introduces a new PEFT method called (IA)$^3$, and presents T-Few, an approach that achieves super-human performance on unseen tasks.
Contribution
The paper provides a comprehensive comparison of PEFT and ICL, introduces the (IA)$^3$ method, and proposes T-Few for zero-shot task generalization with state-of-the-art results.
Findings
PEFT outperforms ICL in accuracy and efficiency.
The new (IA)$^3$ method improves performance with minimal additional parameters.
T-Few achieves super-human performance on unseen tasks, surpassing state-of-the-art.
Abstract
Few-shot in-context learning (ICL) enables pre-trained language models to perform a previously-unseen task without any gradient-based training by feeding a small number of training examples as part of the input. ICL incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made. Parameter-efficient fine-tuning (PEFT) (e.g. adapter modules, prompt tuning, sparse update methods, etc.) offers an alternative paradigm where a small set of parameters are trained to enable a model to perform the new task. In this paper, we rigorously compare few-shot ICL and PEFT and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs. Along the way, we introduce a new PEFT method called (IA)$^3$ that scales activations by learned vectors, attaining stronger performance…
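The phrase "scales activations by learned vectors" can be illustrated with a small sketch. This is only a toy, not the authors' implementation: the abstract does not say which activations are scaled or how the vectors are trained, so the function names, the all-ones initialization, and the hand-set "learned" values below are assumptions made for illustration.

```python
def init_scaling_vector(dim):
    # Assumption: start at all ones, so the scaled model initially
    # behaves exactly like the frozen pre-trained model.
    return [1.0] * dim


def scale_activations(activations, scale):
    """Multiply each activation vector elementwise by one scaling vector."""
    if any(len(row) != len(scale) for row in activations):
        raise ValueError("scale must match the hidden dimension")
    return [[a * s for a, s in zip(row, scale)] for row in activations]


# Toy frozen activations: 2 token positions, hidden size 4.
activations = [
    [0.5, -1.0, 2.0, 0.0],
    [1.5, 0.25, -0.5, 3.0],
]

# With the all-ones vector, the activations pass through unchanged.
scale = init_scaling_vector(4)
unchanged = scale_activations(activations, scale)

# Hand-set values standing in for what training might learn (hypothetical).
scale[1] = 0.5   # dampen the second feature
scale[2] = 2.0   # amplify the third feature
rescaled = scale_activations(activations, scale)

# Only the scaling vector is new; the frozen weights are untouched.
num_new_parameters = len(scale)
```

In this toy, the only new parameters are the four entries of `scale`, which is the sense in which such a method adds very few parameters relative to the frozen model.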
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Adapter
