Text-to-Text Pre-Training for Data-to-Text Tasks

Mihir Kale; Abhinav Rastogi

arXiv:2005.10433 · cs.CL · July 12, 2021 · 31 cites


TL;DR

This paper shows that text-to-text pre-training with T5 improves data-to-text generation: simple end-to-end models outperform pipelined neural architectures and BERT- and GPT-2-based pre-training, and generalize better, especially on out-of-domain test sets.

Contribution

The study applies the pre-train + fine-tune strategy, with T5 as the pre-trained model, to data-to-text tasks, and shows that it surpasses task-specific pipelined architectures and other language-model-based pre-training in both performance and generalization.

Findings

01 T5 pre-training outperforms pipelined neural architectures.

02 T5 achieves better out-of-domain generalization.

03 Pre-training enhances transfer learning for data-to-text tasks.

Abstract

We study the pre-train + fine-tune strategy for data-to-text tasks. Our experiments indicate that text-to-text pre-training in the form of T5 enables simple, end-to-end Transformer-based models to outperform pipelined neural architectures tailored for data-to-text generation, as well as alternative language-model-based pre-training techniques such as BERT and GPT-2. Importantly, T5 pre-training leads to better generalization, as evidenced by large improvements on out-of-domain test sets. We hope our work serves as a useful baseline for future research, as transfer learning becomes ever more prevalent for data-to-text tasks.
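To make the text-to-text framing in the abstract concrete, here is a minimal sketch of how a structured record might be flattened into a plain string, so that a sequence-to-sequence model such as T5 can be fine-tuned on (input text, target text) pairs. The marker tokens (`<subject>`, `<predicate>`, `<object>`), the helper names `linearize_triples` and `make_example`, and the restaurant record are all invented for illustration. They are assumptions, not the linearization used in the paper.

```python
from typing import Dict, Iterable, Tuple

# One (subject, predicate, object) fact from a structured record.
Triple = Tuple[str, str, str]


def linearize_triples(triples: Iterable[Triple]) -> str:
    """Flatten triples into a single string.

    The marker tokens are a hypothetical convention chosen for this
    sketch; any consistent, unambiguous format would serve.
    """
    parts = []
    for subj, pred, obj in triples:
        parts.append(f"<subject> {subj} <predicate> {pred} <object> {obj}")
    return " ".join(parts)


def make_example(triples: Iterable[Triple], target: str) -> Dict[str, str]:
    """Pair the linearized input with its reference text."""
    return {"input_text": linearize_triples(triples), "target_text": target}


# Invented record and reference sentence, for illustration only.
example = make_example(
    [("Blue Spice", "food", "Italian"), ("Blue Spice", "area", "riverside")],
    "Blue Spice serves Italian food by the riverside.",
)
print(example["input_text"])
print(example["target_text"])
```

Once the data is a string, fine-tuning reduces to ordinary sequence-to-sequence training on such pairs, which is what allows a single end-to-end model to stand in for a multi-stage pipeline.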


Code & Models

Datasets

GEM/totto · dataset · 386 dl


Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Linear Layer · Gated Linear Unit · Byte Pair Encoding · Softmax · Inverse Square Root Schedule · SentencePiece · Dense Connections · Layer Normalization · Attention Is All You Need