Continued Pretraining for Better Zero- and Few-Shot Promptability

Zhaofeng Wu; Robert L. Logan IV; Pete Walsh; Akshita Bhagia; Dirk Groeneveld; Sameer Singh; Iz Beltagy

arXiv:2210.10258·cs.CL·October 24, 2022·1 cites

TL;DR

This paper examines whether a dedicated continued pretraining stage can improve language models' zero- and few-shot performance with prompts. It shows that a simple recipe, incorporating a trainable prompt during multi-task continued pretraining, improves promptability by up to 31% relative to existing methods.

Contribution

It introduces a straightforward continued pretraining method with trainable prompts that outperforms existing techniques in zero- and few-shot settings.
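To make "trainable prompt" concrete, the following is a minimal, dependency-free sketch of the general mechanics: a small matrix of vectors is prepended to the token embeddings of every input and updated by gradient descent. It is not the authors' implementation (the official code is the PyTorch repository listed under Code & Models); the function names, dimensions, initialization range, and learning rate are all hypothetical, and the gradients are assumed to come from whatever model consumes the combined input.

```python
import random


def init_soft_prompt(num_prompt_tokens, embed_dim, seed=0):
    """Create a trainable prompt: one embedding vector per virtual token.

    The uniform initialization range is an arbitrary illustrative choice.
    """
    rng = random.Random(seed)
    return [
        [rng.uniform(-0.5, 0.5) for _ in range(embed_dim)]
        for _ in range(num_prompt_tokens)
    ]


def prepend_prompt(soft_prompt, token_embeddings):
    """Return the model input: prompt vectors followed by token embeddings."""
    return [list(v) for v in soft_prompt] + [list(v) for v in token_embeddings]


def sgd_step(soft_prompt, prompt_grads, lr=0.1):
    """Update the prompt in place with one plain gradient-descent step."""
    for vec, grad in zip(soft_prompt, prompt_grads):
        for i, g in enumerate(grad):
            vec[i] -= lr * g


# Hypothetical sizes: 3 prompt vectors, 4-dimensional embeddings, 2 input tokens.
prompt = init_soft_prompt(num_prompt_tokens=3, embed_dim=4)
tokens = [[0.0] * 4, [1.0] * 4]
model_input = prepend_prompt(prompt, tokens)
```

Here `model_input` has five rows: the three prompt vectors followed by the two token embeddings. In a multi-task setting, the same prompt would be prepended to examples from every task.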

Findings

01 Continued pretraining with a trainable prompt improves promptability by up to 31% relative to existing methods.

02 MAML-style meta-learning underperforms as a way to improve promptability.

03 The paper provides recommendations for optimizing promptability across use cases.
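The 31% figure is a relative improvement, not an absolute gain in accuracy points. A short sketch of the arithmetic follows; the function name and the scores are hypothetical and chosen only to illustrate the calculation, not taken from the paper's results.

```python
def relative_improvement(baseline, improved):
    """Return the improvement over the baseline as a fraction of the baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (improved - baseline) / baseline


# Hypothetical scores: a baseline of 40.0 and an improved score of 52.4.
gain = relative_improvement(40.0, 52.4)
```

With these made-up scores, an absolute gain of 12.4 points corresponds to a relative gain of roughly 0.31, or 31%.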

Abstract

Recently introduced language model prompting methods can achieve high accuracy in zero- and few-shot settings while requiring few to no learned task-specific parameters. Nevertheless, these methods still often trail behind full model finetuning. In this work, we investigate if a dedicated continued pretraining stage could improve "promptability", i.e., zero-shot performance with natural language prompts or few-shot performance with prompt tuning. We reveal settings where existing continued pretraining methods lack promptability. We also identify current methodological gaps, which we fill with thorough large-scale experiments. We demonstrate that a simple recipe, continued pretraining that incorporates a trainable prompt during multi-task learning, leads to improved promptability in both zero- and few-shot settings compared to existing methods, up to 31% relative. On the other hand, we…

Code & Models

Repositories

allenai/better-promptability (PyTorch, official)

Models

ZhaofengWu/better-promptability (Hugging Face model)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications