Prompt Tuning for Discriminative Pre-trained Language Models

Yuan Yao; Bowen Dong; Ao Zhang; Zhengyan Zhang; Ruobing Xie; Zhiyuan Liu; Leyu Lin; Maosong Sun; Jianyong Wang

arXiv:2205.11166·cs.CL·May 24, 2022

TL;DR

This paper introduces DPT, a prompt tuning framework designed for discriminative pre-trained language models. On text classification and question answering, DPT outperforms vanilla fine-tuning and tunes more stably.

Contribution

To the authors' knowledge, DPT is the first framework to adapt prompt tuning to discriminative PLMs such as ELECTRA. It does so by reformulating NLP tasks as a discriminative language modeling problem.

Findings

01 DPT outperforms vanilla fine-tuning in text classification and question answering.

02 DPT enhances stability in tuning large PLMs, especially in low-resource settings.

03 DPT achieves higher performance in both full-set and low-resource scenarios.

Abstract

Recent works have shown promising results of prompt tuning in stimulating pre-trained language models (PLMs) for natural language processing (NLP) tasks. However, to the best of our knowledge, existing works focus on prompt-tuning generative PLMs that are pre-trained to generate target tokens, such as BERT. It is still unknown whether and how discriminative PLMs, e.g., ELECTRA, can be effectively prompt-tuned. In this work, we present DPT, the first prompt tuning framework for discriminative PLMs, which reformulates NLP tasks into a discriminative language modeling problem. Comprehensive experiments on text classification and question answering show that, compared with vanilla fine-tuning, DPT achieves significantly higher performance, and also prevents the unstable problem in tuning large PLMs in both full-set and low-resource settings. The source code and experiment details of this…
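
The reformulation described in the abstract can be illustrated with a small sketch for text classification. This is one plausible reading, not the authors' implementation: each candidate label word is filled into a prompt, a discriminator scores whether that word looks "original" or "replaced", and the label whose word looks most original is chosen. The template, the label words, and the `toy_replaced_logit` function are all hypothetical stand-ins; in practice the scoring function would be the replaced-token-detection head of a discriminative PLM such as ELECTRA.

```python
import math
from typing import Callable, Dict, List

# Hypothetical template and label words, chosen only for illustration.
TEMPLATE = "{text} It was {label_word}."
LABEL_WORDS: Dict[str, str] = {"positive": "great", "negative": "terrible"}


def toy_replaced_logit(tokens: List[str], index: int) -> float:
    """Stand-in for a discriminator head (an assumption, not a real model).

    Returns a logit for "tokens[index] was replaced": higher when the label
    word clashes with cue words in the preceding context, lower when it agrees.
    """
    positive_cues = {"loved", "enjoyed", "wonderful"}
    negative_cues = {"hated", "boring", "awful"}
    word = tokens[index]
    context = set(tokens[:index])
    if word == "great":
        agree, clash = positive_cues, negative_cues
    elif word == "terrible":
        agree, clash = negative_cues, positive_cues
    else:
        return 0.0
    return float(len(context & clash) - len(context & agree))


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def classify(
    text: str,
    replaced_logit: Callable[[List[str], int], float] = toy_replaced_logit,
) -> Dict[str, float]:
    """Score each label by the probability that its label word is 'original'."""
    scores: Dict[str, float] = {}
    for label, word in LABEL_WORDS.items():
        prompt = TEMPLATE.format(text=text, label_word=word)
        # Naive whitespace tokenization that splits off periods.
        tokens = prompt.lower().replace(".", " .").split()
        index = len(tokens) - 2  # the label word sits just before the final "."
        scores[label] = 1.0 - sigmoid(replaced_logit(tokens, index))
    return scores


def predict(text: str) -> str:
    """Pick the label whose word the discriminator finds most plausible."""
    scores = classify(text)
    return max(scores, key=scores.get)
```

Calling `predict("I loved this film.")` compares the prompt ending in "great" against the one ending in "terrible" and returns the label with the higher "original" probability. Swapping `toy_replaced_logit` for a real discriminator only requires a function with the same `(tokens, index)` signature.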

Code & Models

Repositories

thunlp/dpt (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Linear Warmup With Linear Decay · Dropout · Attention Dropout · WordPiece · Adam