MVP: Multi-task Supervised Pre-training for Natural Language Generation

Tianyi Tang; Junyi Li; Wayne Xin Zhao; Ji-Rong Wen

arXiv:2206.12131 · cs.CL · May 30, 2023 · 6 cites

TL;DR

This paper introduces MVP, a multi-task supervised pre-training approach for natural language generation that unifies diverse datasets into a text-to-text format, leading to state-of-the-art results on multiple NLG tasks.

Contribution

The paper presents a large-scale supervised pre-training method that combines a unified text-to-text corpus (MVPCorpus) with task-specific soft prompts, improving NLG performance over existing models.

Findings

1. Achieves state-of-the-art results on 13 out of 17 datasets.

2. Outperforms BART by 9.3% and Flan-T5 by 5.8%.

3. Demonstrates the effectiveness of supervised multi-task pre-training.

Abstract

Pre-trained language models (PLMs) have achieved remarkable success in natural language generation (NLG) tasks. Up to now, most NLG-oriented PLMs have been pre-trained in an unsupervised manner on large-scale general corpora. Meanwhile, an increasing number of models pre-trained with labeled data (i.e., "supervised pre-training") show superior performance compared to unsupervised pre-trained models. Motivated by the success of supervised pre-training, we propose Multi-task superVised Pre-training (MVP) for natural language generation. We collect a large-scale natural language generation corpus, MVPCorpus, from $77$ datasets over $11$ diverse NLG tasks. We then unify these examples into a general text-to-text format to pre-train the text generation model MVP in a supervised manner. For each task, we further pre-train specific soft prompts to stimulate the model's capacity to…
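The unification step described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the instruction strings in `TASK_INSTRUCTIONS`, the `TextToTextExample` container, and the triple linearization are hypothetical and are not taken from the paper or its released code. The sketch only shows the general idea of mapping examples from different NLG tasks onto one (source text, target text) shape.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple, Union

# Hypothetical task instructions. The abstract does not give the exact
# wording used to build MVPCorpus, so these strings are placeholders.
TASK_INSTRUCTIONS = {
    "summarization": "Summarize:",
    "data_to_text": "Describe the following data:",
    "question_generation": "Generate the question based on the answer:",
}

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class TextToTextExample:
    """One unified training example: plain source text and plain target text."""
    task: str
    source: str
    target: str


def linearize_triples(triples: Sequence[Triple]) -> str:
    """Flatten structured (subject, relation, object) triples into a string.

    The ' | ' and ' [SEP] ' separators are an illustrative choice.
    """
    return " [SEP] ".join(" | ".join(triple) for triple in triples)


def to_text_to_text(
    task: str,
    raw_input: Union[str, Sequence[Triple]],
    target: str,
) -> TextToTextExample:
    """Convert a task-specific example into the shared text-to-text shape."""
    if task not in TASK_INSTRUCTIONS:
        raise ValueError(f"unknown task: {task!r}")
    # Plain-text inputs pass through; structured inputs are linearized first.
    text = raw_input if isinstance(raw_input, str) else linearize_triples(raw_input)
    return TextToTextExample(task, f"{TASK_INSTRUCTIONS[task]} {text}", target)


if __name__ == "__main__":
    examples = [
        to_text_to_text("summarization", "A long article.", "Short."),
        to_text_to_text(
            "data_to_text",
            [("Alice", "born_in", "Paris")],
            "Alice was born in Paris.",
        ),
    ]
    for example in examples:
        print(example.source, "->", example.target)
```

Once every example has this shape, a single sequence-to-sequence model can be trained on the mixed corpus without task-specific input or output heads. The task-specific soft prompts mentioned in the abstract would be learned separately and are not modeled in this sketch.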

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Dense Connections · Softmax · Layer Normalization · Byte Pair Encoding · Dropout · Linear Layer · Residual Connection