GPT Understands, Too

Xiao Liu; Yanan Zheng; Zhengxiao Du; Ming Ding; Yujie Qian; Zhilin Yang; Jie Tang

arXiv:2103.10385·cs.CL·October 26, 2023

TL;DR

This paper introduces P-Tuning, a method using trainable continuous prompts combined with discrete prompts to stabilize and enhance the performance of pretrained language models across various NLU tasks.

Contribution

P-Tuning concatenates trainable continuous prompt embeddings with discrete prompts to improve both stability and performance on natural language understanding tasks.

Findings

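01. P-Tuning stabilizes training by narrowing the performance gap between different discrete prompts.

02. It improves performance on NLU benchmarks including LAMA and SuperGLUE.

03. It is effective for both frozen and tuned language models, in fully-supervised and few-shot settings.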

Abstract

Prompting a pretrained language model with natural language patterns has been proved effective for natural language understanding (NLU). However, our preliminary study reveals that manual discrete prompts often lead to unstable performance -- e.g., changing a single word in the prompt might result in substantial performance drop. We propose a novel method P-Tuning that employs trainable continuous prompt embeddings in concatenation with discrete prompts. Empirically, P-Tuning not only stabilizes training by minimizing the gap between various discrete prompts, but also improves performance by a sizeable margin on a wide range of NLU tasks including LAMA and SuperGLUE. P-Tuning is generally effective for both frozen and tuned language models, under both the fully-supervised and few-shot settings.
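
To make "continuous prompt embeddings in concatenation with discrete prompts" concrete, here is a minimal sketch of how such an input sequence might be assembled. It is not the authors' implementation: the embedding size, the toy vocabulary, the helper names (`make_embedding_table`, `build_input`), and the placement of the continuous vectors in front of the discrete prompt are all assumptions chosen for illustration, and random vectors stand in for a real pretrained embedding table.

```python
import random

EMBED_DIM = 4  # hypothetical embedding size, chosen only for illustration


def make_embedding_table(vocab, dim, seed=0):
    """Stand-in for a pretrained model's token embedding table (random values)."""
    rng = random.Random(seed)
    return {tok: [rng.uniform(-1, 1) for _ in range(dim)] for tok in vocab}


def build_input(continuous_prompts, discrete_prompt, context, table):
    """Concatenate free prompt vectors with embedded discrete tokens.

    The ordering (continuous, then discrete, then context) is an assumption
    of this sketch, not something the abstract specifies.
    """
    discrete_vecs = [table[tok] for tok in discrete_prompt]
    context_vecs = [table[tok] for tok in context]
    return continuous_prompts + discrete_vecs + context_vecs


vocab = ["The", "capital", "of", "Britain", "is", "[MASK]"]
table = make_embedding_table(vocab, EMBED_DIM)

# Continuous prompts: vectors that are not tied to any vocabulary entry.
# In training these would be the parameters updated by the optimizer.
rng = random.Random(1)
num_continuous = 3  # hypothetical number of continuous prompt positions
continuous_prompts = [
    [rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(num_continuous)
]

discrete_prompt = ["The", "capital", "of"]
context = ["Britain", "is", "[MASK]"]
inputs = build_input(continuous_prompts, discrete_prompt, context, table)

print(len(inputs), "input vectors of dimension", EMBED_DIM)
```

With a frozen language model, only `continuous_prompts` would receive gradient updates; with a tuned model, the model weights would be updated as well. The sequence `inputs` is what would be fed to the model in place of ordinary token embeddings.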

Taxonomy

Methods: Linear Layer · Cosine Annealing · Adam · Discriminative Fine-Tuning · Attention Is All You Need · Attention Dropout · Byte Pair Encoding · Layer Normalization · Residual Connection