Generating Training Data with Language Models: Towards Zero-Shot Language Understanding

Yu Meng; Jiaxin Huang; Yu Zhang; Jiawei Han

arXiv:2202.04538·cs.CL·October 13, 2022·80 cites

TL;DR

This paper introduces a zero-shot learning approach for natural language understanding by generating training data with language models, achieving competitive results on the GLUE benchmark without task-specific data.

Contribution

The authors propose a method in which a unidirectional PLM generates class-conditioned training data and a bidirectional PLM is fine-tuned on that data for zero-shot NLU, improving performance over existing zero-shot prompting methods.

Findings

01. Achieves strong zero-shot performance on GLUE tasks

02. Outperforms zero-shot prompting methods significantly

03. Comparable to few-shot approaches with limited data

Abstract

Pretrained language models (PLMs) have demonstrated remarkable performance in various natural language processing tasks: Unidirectional PLMs (e.g., GPT) are well known for their superior text generation capabilities; bidirectional PLMs (e.g., BERT) have been the prominent choice for natural language understanding (NLU) tasks. While both types of models have achieved promising few-shot learning performance, their potential for zero-shot learning has been underexplored. In this paper, we present a simple approach that uses both types of PLMs for fully zero-shot learning of NLU tasks without requiring any task-specific data: A unidirectional PLM generates class-conditioned texts guided by prompts, which are used as the training data for fine-tuning a bidirectional PLM. With quality training data selected based on the generation probability and regularization techniques (label smoothing and…
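The two ingredients the abstract names, selecting generated texts by generation probability and regularizing with label smoothing, can be illustrated with a minimal sketch. This is not the authors' implementation: the sample texts, the log-probability values, the length-normalized scoring rule, the top-k cutoff, and the helper names `avg_log_prob`, `select_top_k`, and `smoothed_target` are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical record: (generated text, class label, per-token log-probabilities
# reported by the generator). None of these values come from the paper.
Sample = Tuple[str, str, List[float]]


def avg_log_prob(token_log_probs: List[float]) -> float:
    """Length-normalized log-likelihood of one generated text (assumed scoring rule)."""
    return sum(token_log_probs) / len(token_log_probs)


def select_top_k(samples: List[Sample], k: int) -> Dict[str, List[str]]:
    """Keep the k highest-scoring generated texts for each class."""
    by_label = defaultdict(list)
    for text, label, log_probs in samples:
        by_label[label].append((avg_log_prob(log_probs), text))
    return {
        label: [text for _, text in sorted(scored, reverse=True)[:k]]
        for label, scored in by_label.items()
    }


def smoothed_target(label_index: int, num_classes: int, eps: float = 0.1) -> List[float]:
    """Standard label smoothing: mix the one-hot target with a uniform distribution."""
    uniform = eps / num_classes
    return [
        (1.0 - eps) + uniform if i == label_index else uniform
        for i in range(num_classes)
    ]


# Toy generated data for a two-class sentiment task (illustrative only).
samples: List[Sample] = [
    ("a great movie", "positive", [-0.2, -0.4, -0.3]),
    ("movie the of great", "positive", [-2.0, -3.0, -2.5, -1.5]),
    ("a dull film", "negative", [-0.5, -0.7, -0.6]),
    ("film film dull", "negative", [-3.0, -2.0, -4.0]),
]

selected = select_top_k(samples, k=1)
target = smoothed_target(label_index=0, num_classes=2, eps=0.1)
```

In this toy run the fluent sentence in each class is kept and the low-probability one is dropped. The smoothed target would then replace the hard one-hot label when fine-tuning the classifier on the selected texts, which is one way to soften the effect of noisy generated labels.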

Code & Models

Repositories

yumeng5/supergen (PyTorch, official)

Videos

Generating Training Data with Language Models: Towards Zero-Shot Language Understanding · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification