Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models

Robert L. Logan IV; Ivana Balažević; Eric Wallace; Fabio Petroni; Sameer Singh; Sebastian Riedel

arXiv:2106.13353·cs.CL·July 2, 2021·5 cites

TL;DR

This paper shows that finetuning language models in the few-shot setting considerably reduces the need for prompt engineering, and that bias-only finetuning achieves comparable or better accuracy than standard finetuning while updating only 0.1% of the parameters.

Contribution

It shows that bias-only finetuning is a memory-efficient alternative to prompt engineering for few-shot learning with language models, updating only 0.1% of the parameters for each downstream task.

Findings

01

Bias-only finetuning achieves comparable or better accuracy than standard finetuning.

02

With finetuning, null prompts achieve accuracy competitive with manually-tuned prompts across a wide range of tasks.

03

Finetuning only bias terms updates 0.1% of parameters, reducing memory overhead.
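
To make the parameter-count argument concrete, here is a minimal sketch of how selecting only bias terms shrinks the trainable set. The parameter table, its names, and its shapes are hypothetical and describe a single toy layer, not the models used in the paper; the name-suffix rule for picking bias terms is also an assumption for illustration.

```python
from math import prod

# Hypothetical parameter table for one toy layer (name -> shape).
# Names and shapes are illustrative assumptions, not taken from the paper.
TOY_PARAMS = {
    "dense.weight": (768, 768),
    "dense.bias": (768,),
    "layer_norm.weight": (768,),
    "layer_norm.bias": (768,),
}


def is_trainable(name: str) -> bool:
    """Assumed selection rule: train a parameter only if it is a bias term."""
    return name.endswith(".bias")


def trainable_fraction(params: dict) -> float:
    """Fraction of scalar parameters that the selection rule leaves trainable."""
    total = sum(prod(shape) for shape in params.values())
    trainable = sum(
        prod(shape) for name, shape in params.items() if is_trainable(name)
    )
    return trainable / total


if __name__ == "__main__":
    # Prints the trainable share for the toy layer as a percentage.
    print(f"{100 * trainable_fraction(TOY_PARAMS):.2f}% of parameters are trainable")
```

In a real training setup the same rule would typically be applied by freezing every non-bias parameter before building the optimizer; the exact share depends on the architecture, so the toy figure here should not be read as the paper's 0.1%.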

Abstract

Prompting language models (LMs) with training examples and task descriptions has been seen as critical to recent successes in few-shot learning. In this work, we show that finetuning LMs in the few-shot setting can considerably reduce the need for prompt engineering. In fact, one can use null prompts, prompts that contain neither task-specific templates nor training examples, and achieve competitive accuracy to manually-tuned prompts across a wide range of tasks. While finetuning LMs does introduce new parameters for each downstream task, we show that this memory overhead can be substantially reduced: finetuning only the bias terms can achieve comparable or better accuracy than standard finetuning while only updating 0.1% of the parameters. All in all, we recommend finetuning LMs for few-shot learning as it is more accurate, robust to different prompts, and can be made nearly as…
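
The contrast between a manually-tuned prompt and a null prompt can be sketched as follows. The abstract only states that null prompts contain neither task-specific templates nor training examples; the `[MASK]` token, its placement at the end of the input, and the example manual template are assumptions made for illustration.

```python
MASK = "[MASK]"  # assumed mask token for a masked language model


def manual_prompt(premise: str, hypothesis: str) -> str:
    """Hypothetical hand-written template for a sentence-pair task."""
    return f"{premise} Question: is it true that {hypothesis} Answer: {MASK}"


def null_prompt(*fields: str) -> str:
    """Null prompt: the raw input fields plus the mask token, with no template text."""
    return " ".join([*fields, MASK])


if __name__ == "__main__":
    premise = "A man is sleeping."
    hypothesis = "A person rests."
    print(manual_prompt(premise, hypothesis))
    print(null_prompt(premise, hypothesis))
```

The null prompt needs no wording choices at all, which is why it removes the prompt-engineering step; under this sketch the model would be finetuned to predict the label word at the mask position.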

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Natural Language Processing Techniques