Generate, Annotate, and Learn: NLP with Synthetic Text
Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi

TL;DR
This paper introduces GAL (generate, annotate, and learn), a framework that uses synthetic unlabeled text generated by language models to improve knowledge distillation, self-training, and few-shot learning, reaching state-of-the-art results on GLUE with 6-layer transformers.
Contribution
The paper proposes a unified framework for using synthetic text in various learning paradigms and provides theoretical and empirical analysis of generation strategies.
Findings
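GAL yields significant gains across knowledge distillation, self-training, and few-shot learning.
Generating unlabeled synthetic text and annotating it with pseudo labels is more effective than generating labeled synthetic text directly with class-conditional LMs.
GAL achieves state-of-the-art knowledge distillation results on GLUE with 6-layer transformers.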
Abstract
This paper studies the use of language models as a source of synthetic unlabeled text for NLP. We formulate a general framework called "generate, annotate, and learn (GAL)" to take advantage of synthetic text within knowledge distillation, self-training, and few-shot learning applications. To generate high-quality task-specific text, we either fine-tune LMs on inputs from the task of interest, or prompt large LMs with few examples. We use the best available classifier to annotate synthetic text with soft pseudo labels for knowledge distillation and self-training, and use LMs to obtain hard labels for few-shot learning. We train new supervised models on the combination of labeled and pseudo-labeled data, which results in significant gains across several applications. We investigate key components of GAL and present theoretical and empirical arguments against the use of…
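The three steps named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a unigram word sampler as a stand-in for a fine-tuned LM, a bag-of-words logistic regression as both teacher and student, and a tiny made-up sentiment dataset. All function names (`generate`, `annotate`, `learn`, `featurize`, `predict_proba`) are hypothetical. Only the soft-pseudo-label case (knowledge distillation and self-training) is covered, not few-shot prompting.

```python
import math
import random

def generate(task_inputs, n_samples, length=4, seed=0):
    """Generate step (stand-in): sample words from the unigram distribution
    of the task inputs. A real setup would sample from a fine-tuned LM."""
    rng = random.Random(seed)
    words = [w for text in task_inputs for w in text.split()]
    return [" ".join(rng.choice(words) for _ in range(length))
            for _ in range(n_samples)]

def featurize(text, vocab):
    """Bag-of-words counts over a fixed vocabulary."""
    tokens = text.split()
    return [float(tokens.count(w)) for w in vocab]

def predict_proba(weights, features):
    """Probability of the positive class under a logistic model."""
    z = sum(w * x for w, x in zip(weights, features))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def learn(examples, vocab, epochs=200, lr=0.1):
    """Learn step: SGD on cross-entropy. Targets may be hard labels (0 or 1)
    or soft pseudo labels in [0, 1]."""
    weights = [0.0] * len(vocab)
    for _ in range(epochs):
        for text, target in examples:
            x = featurize(text, vocab)
            p = predict_proba(weights, x)
            # Gradient of cross-entropy w.r.t. the weights is (p - target) * x.
            weights = [w - lr * (p - target) * xi for w, xi in zip(weights, x)]
    return weights

def annotate(teacher_weights, texts, vocab):
    """Annotate step: attach the teacher's soft pseudo label to each text."""
    return [(t, predict_proba(teacher_weights, featurize(t, vocab)))
            for t in texts]

# Made-up labeled data: 1.0 = positive, 0.0 = negative.
labeled = [("great fun movie", 1.0), ("boring dull plot", 0.0),
           ("fun great cast", 1.0), ("dull boring script", 0.0)]
vocab = sorted({w for text, _ in labeled for w in text.split()})

teacher = learn(labeled, vocab)                                  # best available classifier
synthetic = generate([t for t, _ in labeled], n_samples=20)      # generate
pseudo_labeled = annotate(teacher, synthetic, vocab)             # annotate
student = learn(labeled + pseudo_labeled, vocab)                 # learn on the combination
```

In this sketch the generator only sees task inputs, never labels, and the labels on synthetic text come from the teacher. That mirrors the abstract's description of annotating unlabeled synthetic text rather than generating labeled text directly.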
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Knowledge Distillation · Cosine Annealing · Residual Connection · Linear Warmup With Cosine Annealing · Attention Dropout · Dense Connections
