Prompt Tuning for Discriminative Pre-trained Language Models
Yuan Yao, Bowen Dong, Ao Zhang, Zhengyan Zhang, Ruobing Xie, Zhiyuan Liu, Leyu Lin, Maosong Sun, Jianyong Wang

TL;DR
This paper introduces DPT, a prompt tuning framework designed for discriminative pre-trained language models, and reports higher performance and more stable tuning than vanilla fine-tuning on text classification and question answering.
Contribution
It is the first to adapt prompt tuning for discriminative PLMs like ELECTRA, reformulating NLP tasks into discriminative language modeling.
Findings
DPT outperforms vanilla fine-tuning in text classification and question answering.
DPT avoids the instability seen when tuning large PLMs, in both full-set and low-resource settings.
DPT achieves higher performance in both full-set and low-resource scenarios.
Abstract
Recent works have shown promising results of prompt tuning in stimulating pre-trained language models (PLMs) for natural language processing (NLP) tasks. However, to the best of our knowledge, existing works focus on prompt-tuning generative PLMs that are pre-trained to generate target tokens, such as BERT. It is still unknown whether and how discriminative PLMs, e.g., ELECTRA, can be effectively prompt-tuned. In this work, we present DPT, the first prompt tuning framework for discriminative PLMs, which reformulates NLP tasks into a discriminative language modeling problem. Comprehensive experiments on text classification and question answering show that, compared with vanilla fine-tuning, DPT achieves significantly higher performance, and also prevents the unstable problem in tuning large PLMs in both full-set and low-resource settings. The source code and experiment details of this…
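The abstract describes recasting a task as discriminative language modeling, where the model judges tokens instead of generating them. The sketch below is one possible reading of that idea for text classification, not the authors' implementation: every candidate label word is appended to the input, a discriminator scores each token as "original" or not, and the label word with the highest score wins. The template (`Class:`), the function names, and `toy_discriminator` (a hand-written stand-in for a real model such as ELECTRA's discriminator) are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical type: maps a token sequence to one "looks original" score per token.
Scorer = Callable[[List[str]], List[float]]


def build_prompt(text: str, label_words: List[str]) -> List[str]:
    """Append every candidate label word to the input (assumed template)."""
    return text.split() + ["Class:"] + label_words


def classify(text: str, label_words: List[str],
             score_original: Scorer) -> Tuple[str, Dict[str, float]]:
    """Pick the label word that the discriminator finds most plausible."""
    tokens = build_prompt(text, label_words)
    scores = score_original(tokens)
    # Label words occupy the last len(label_words) positions of the prompt.
    offset = len(tokens) - len(label_words)
    label_scores = {w: scores[offset + i] for i, w in enumerate(label_words)}
    best = max(label_scores, key=label_scores.get)
    return best, label_scores


def toy_discriminator(tokens: List[str]) -> List[float]:
    """Stand-in scorer (NOT a real model): a label word scores high only
    if one of its hand-picked cue words appears in the sequence."""
    cues = {"great": {"loved", "wonderful"}, "terrible": {"hated", "boring"}}
    context = {t.lower().strip(".,!") for t in tokens}
    scores = []
    for tok in tokens:
        if tok in cues:
            scores.append(0.9 if cues[tok] & context else 0.1)
        else:
            scores.append(0.5)  # neutral score for non-label tokens
    return scores


if __name__ == "__main__":
    label, per_label = classify("I loved this movie",
                                ["great", "terrible"], toy_discriminator)
    print(label, per_label)
```

In practice the stand-in scorer would be replaced by a trained discriminator's per-token outputs; the selection step over label-word positions would stay the same under this reading.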
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Linear Warmup With Linear Decay · Dropout · Attention Dropout · WordPiece · Adam
