Towards Zero-Label Language Learning
Zirui Wang, Adams Wei Yu, Orhan Firat, Yuan Cao

TL;DR
This paper introduces a zero-label learning framework in NLP that uses synthetic data generated via few-shot prompts with pretrained models, achieving competitive results without human annotations.
Contribution
The paper proposes Unsupervised Data Generation (UDG), a novel method for creating high-quality training data from pretrained models without human labels, enabling zero-label learning and effective data augmentation.
Findings
Task-specific models trained solely on UDG synthetic data achieve results comparable to or better than strong baselines trained on human-labeled data.
Sets a new state of the art on the SuperGLUE benchmark when the synthetic data is mixed with labeled data.
Demonstrates the effectiveness of synthetic data in training task-specific NLP models.
Abstract
This paper explores zero-label learning in Natural Language Processing (NLP), whereby no human-annotated data is used anywhere during training and models are trained purely on synthetic data. At the core of our framework is a novel approach for better leveraging the powerful pretrained language models. Specifically, inspired by the recent success of few-shot inference on GPT-3, we present a training data creation procedure named Unsupervised Data Generation (UDG), which leverages few-shot prompts to synthesize high-quality training data without real human annotations. Our method enables zero-label learning as we train task-specific models solely on the synthetic data, yet we achieve results better than or comparable to those of strong baseline models trained on human-labeled data. Furthermore, when mixed with labeled data, our approach serves as a highly effective data augmentation procedure,…
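The data creation step described in the abstract can be illustrated with a minimal sketch. It assumes a prompt made of a few unlabeled samples followed by an open, label-conditioned slot that a pretrained language model completes. The prompt wording, the function names (`build_prompt`, `generate_synthetic_dataset`), and the `stub_generate` placeholder are all hypothetical and are not taken from the paper; in practice `generate` would wrap a call to a real pretrained model.

```python
from typing import Callable, List, Tuple


def build_prompt(unlabeled_samples: List[str], label_description: str) -> str:
    """Assemble a few-shot prompt from unlabeled text plus a label-conditioned slot.

    The "Sample Movie Review:" wording is an illustrative assumption.
    """
    lines = [f"Sample Movie Review: {text}" for text in unlabeled_samples]
    # The final line is left open so the language model writes the new example.
    lines.append(f"{label_description} Movie Review:")
    return "\n".join(lines)


def generate_synthetic_dataset(
    generate: Callable[[str], str],
    unlabeled_samples: List[str],
    label_descriptions: List[str],
    per_label: int,
) -> List[Tuple[str, str]]:
    """Collect (text, label) pairs by prompting once per requested example."""
    dataset: List[Tuple[str, str]] = []
    for label in label_descriptions:
        prompt = build_prompt(unlabeled_samples, label)
        for _ in range(per_label):
            text = generate(prompt).strip()
            if text:  # skip empty completions
                dataset.append((text, label))
    return dataset


def stub_generate(prompt: str) -> str:
    """Hypothetical stand-in for a pretrained language model call."""
    last_line = prompt.splitlines()[-1]
    return f" (model completion for '{last_line}') "


if __name__ == "__main__":
    samples = ["The plot dragged on forever.", "A joyful ride from start to finish."]
    data = generate_synthetic_dataset(
        stub_generate, samples, ["Negative", "Positive"], per_label=2
    )
    for text, label in data:
        print(label, "|", text)
```

The resulting pairs would then serve as the training set for a task-specific model, with no human-provided labels involved; the label attached to each example comes only from the prompt that produced it.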
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Attention Is All You Need · Linear Layer · Weight Decay · Cosine Annealing · Dense Connections · Attention Dropout · Multi-Head Attention
