Zero-shot Text Classification With Generative Language Models

Raul Puri; Bryan Catanzaro

arXiv:1912.10165·cs.CL·December 24, 2019·74 cites

Open Access

TL;DR

This paper shows that generative language models can perform zero-shot text classification when given natural language descriptions of tasks, improving accuracy by up to 45% absolute over random or majority class baselines without access to task training data.

Contribution

The paper introduces a method for zero-shot text classification that gives a generative model a natural language description of the task and trains it to generate the answer as text, removing the need for multiple multitask classification heads.

Findings

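01 Up to 45% absolute improvement in classification accuracy over random or majority class baselines

02 Zero-shot evaluation on six benchmark text classification datasets from the torchtext library

03 Natural language task descriptions enable adaptation to new classification tasks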
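The 45% figure is described in the abstract as an absolute improvement over random or majority class baselines. The sketch below shows how such a comparison can be computed. The function names are hypothetical and the labels in the usage note are made-up toy data, not results from the paper.

```python
from collections import Counter
from typing import Sequence


def majority_baseline(labels: Sequence[str]) -> float:
    """Accuracy of always predicting the most frequent label."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def random_baseline(labels: Sequence[str]) -> float:
    """Expected accuracy of guessing uniformly among the observed classes."""
    return 1.0 / len(set(labels))


def absolute_improvement(accuracy: float, baseline: float) -> float:
    """Difference between model accuracy and a baseline, in percentage points."""
    return 100.0 * (accuracy - baseline)


# Toy labels for illustration only (an assumption, not data from the paper).
toy_labels = ["positive"] * 6 + ["negative"] * 4
toy_baseline = max(majority_baseline(toy_labels), random_baseline(toy_labels))
```

An "absolute" improvement is a difference in percentage points, so a hypothetical model accuracy of 0.80 against `toy_baseline` would be reported via `absolute_improvement(0.80, toy_baseline)` rather than as a ratio of the two numbers.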

Abstract

This work investigates the use of natural language to enable zero-shot model adaptation to new tasks. We use text and metadata from social commenting platforms as a source for a simple pretraining task. We then provide the language model with natural language descriptions of classification tasks as input and train it to generate the correct answer in natural language via a language modeling objective. This allows the model to generalize to new classification tasks without the need for multiple multitask classification heads. We show the zero-shot performance of these generative language models, trained with weak supervision, on six benchmark text classification datasets from the torchtext library. Despite no access to training data, we achieve up to a 45% absolute improvement in classification accuracy over random or majority class baselines. These results show that natural language can…
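As a rough illustration of the idea in the abstract, a classification example can be rendered as a natural language question and the model's generated text mapped back to a label. The prompt template, the function names, and the example task below are assumptions made for illustration. They are not the paper's actual prompt format, and no language model is called here.

```python
from typing import Optional, Sequence


def build_prompt(text: str, task_description: str, choices: Sequence[str]) -> str:
    """Render one classification example as a natural language question.

    The template is hypothetical; the paper's exact format may differ.
    """
    options = ", ".join(choices)
    return (
        f"Text: {text}\n"
        f"Question: {task_description} Options: {options}\n"
        f"Answer:"
    )


def parse_answer(generated: str, choices: Sequence[str]) -> Optional[str]:
    """Map free-form generated text back to one of the candidate labels.

    Longer choices are checked first so that a label which is a prefix
    of another label does not match by accident.
    """
    normalized = generated.strip().lower()
    for choice in sorted(choices, key=len, reverse=True):
        if normalized.startswith(choice.lower()):
            return choice
    return None  # the generated text matched no candidate label


# Hypothetical task description and label set.
example_choices = ["World", "Sports", "Business"]
example_prompt = build_prompt(
    "The team won the final in extra time.",
    "What is the topic of this text?",
    example_choices,
)
```

Because the label set appears in the prompt instead of in a fixed output layer, the same model can be pointed at a different task by changing only `task_description` and `choices`, which is the property that makes separate classification heads unnecessary.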

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications