Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners

Ningyu Zhang; Luoqiu Li; Xiang Chen; Shumin Deng; Zhen Bi; Chuanqi Tan; Fei Huang; Huajun Chen

arXiv:2108.13161·cs.CL·January 26, 2023·75 cites

TL;DR

This paper introduces DART, a differentiable prompt method that enhances pre-trained language models' few-shot learning capabilities without prompt engineering, by jointly optimizing prompts and labels through backpropagation.

Contribution

The paper presents DART, a novel, plug-and-play, differentiable prompt approach that improves few-shot learning in pre-trained language models without requiring prompt engineering.

Findings

01. DART improves few-shot performance across standard NLP tasks.
02. The approach is compatible with various pre-trained models.
03. It extends to multiple classification tasks.

Abstract

Large-scale pre-trained language models have contributed significantly to natural language processing by demonstrating remarkable abilities as few-shot learners. However, their effectiveness depends mainly on scaling the model parameters and prompt design, hindering their implementation in most real-world applications. This study proposes a novel pluggable, extensible, and efficient approach named DifferentiAble pRompT (DART), which can convert small language models into better few-shot learners without any prompt engineering. The main principle behind this approach involves reformulating potential natural language processing tasks into the task of a pre-trained language model and differentially optimizing the prompt template as well as the target label with backpropagation. Furthermore, the proposed approach can be: (i) Plugged to any pre-trained language models; (ii) Extended to…
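The core idea in the abstract, optimizing the prompt template and the target label jointly by backpropagation, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the pre-trained language model is replaced by a simple average of an input vector and one learnable prompt vector, each class is represented by one learnable label vector, and the names `forward`, `train`, `predict`, and `FEW_SHOT` are hypothetical.

```python
import math
import random

# Hypothetical few-shot data: (input feature vector, class index).
FEW_SHOT = [([1.0, 0.0], 0), ([0.9, 0.1], 0), ([0.0, 1.0], 1), ([0.1, 0.9], 1)]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def forward(x, prompt, labels):
    # Toy stand-in for the language model: average the input and the prompt.
    h = [(xi + pi) / 2 for xi, pi in zip(x, prompt)]
    # Score each class by the dot product with its learnable label vector.
    logits = [sum(hi * li for hi, li in zip(h, lab)) for lab in labels]
    return h, softmax(logits)


def train(data, dim, n_classes, epochs=100, lr=0.5, seed=0):
    rng = random.Random(seed)
    prompt = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    labels = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
              for _ in range(n_classes)]
    for _ in range(epochs):
        for x, y in data:
            h, probs = forward(x, prompt, labels)
            # Gradient of cross-entropy w.r.t. logits: softmax minus one-hot.
            dlogits = [p - (1.0 if c == y else 0.0)
                       for c, p in enumerate(probs)]
            # Backpropagate into the pooled representation h.
            dh = [sum(dlogits[c] * labels[c][i] for c in range(n_classes))
                  for i in range(dim)]
            # Joint update: label vectors and prompt vector in the same step.
            for c in range(n_classes):
                for i in range(dim):
                    labels[c][i] -= lr * dlogits[c] * h[i]
            for i in range(dim):
                prompt[i] -= lr * dh[i] / 2  # dh/dprompt = 1/2 from averaging
    return prompt, labels


def predict(x, prompt, labels):
    _, probs = forward(x, prompt, labels)
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    prompt, labels = train(FEW_SHOT, dim=2, n_classes=2)
    print([predict(x, prompt, labels) for x, _ in FEW_SHOT])
```

The point of the sketch is only that neither the prompt nor the label representation is hand-written: both start from random values and are adjusted by the same gradient signal. In the method the abstract describes, these would be embeddings inside a pre-trained language model rather than free-standing vectors.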

Code & Models

Videos

Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications