The Curious Case of Neural Text Degeneration

Ari Holtzman; Jan Buys; Li Du; Maxwell Forbes; Yejin Choi

arXiv:1904.09751 · cs.CL · February 18, 2020 · 1.1k cites

TL;DR

This paper investigates the phenomenon of neural text degeneration, revealing distributional differences between human and machine text, and introduces Nucleus Sampling to improve diversity and quality in neural text generation.

Contribution

The paper uncovers distributional differences between human and machine text and proposes Nucleus Sampling as an effective decoding strategy to enhance diversity and coherence.

Findings

01. Nucleus Sampling improves diversity in generated text.

02. Decoding strategies significantly impact text quality.

03. Distributional differences explain text degeneration phenomena.

Abstract

Despite considerable advancements with deep neural language models, the enigma of neural text degeneration persists when these models are tested as text generators. The counter-intuitive empirical observation is that even though the use of likelihood as a training objective leads to high-quality models for a broad range of language understanding tasks, using likelihood as a decoding objective leads to text that is bland and strangely repetitive. In this paper, we reveal surprising distributional differences between human text and machine text. In addition, we find that decoding strategies alone can dramatically affect the quality of machine text, even when generated from exactly the same neural language model. Our findings motivate Nucleus Sampling, a simple but effective method to draw the best out of neural generation. By sampling text from the dynamic nucleus of the probability…
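To make the idea of a "dynamic nucleus" concrete, here is a minimal sketch of one decoding step, assuming the common formulation of Nucleus (top-p) Sampling: keep the smallest set of highest-probability tokens whose cumulative mass reaches a threshold p, renormalize, and sample from that set. The function names `nucleus_filter` and `nucleus_sample` are illustrative and do not come from the paper's code, and the input is assumed to be an already-normalized next-token distribution.

```python
import random


def nucleus_filter(probs, p):
    """Return {token_index: renormalized_prob} for the nucleus.

    Assumption: `probs` is a normalized next-token distribution and
    `p` is the cumulative-mass threshold in (0, 1].
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    # Rank token indices from most to least probable.
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, mass = [], 0.0
    for idx, prob in ranked:
        kept.append((idx, prob))
        mass += prob
        if mass >= p:
            # Smallest prefix of the ranking whose mass reaches p.
            break
    # Renormalize so the kept probabilities sum to 1.
    return {idx: prob / mass for idx, prob in kept}


def nucleus_sample(probs, p, rng=random):
    """Sample one token index from the renormalized nucleus."""
    nucleus = nucleus_filter(probs, p)
    indices = list(nucleus)
    weights = [nucleus[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]
```

For example, with `probs = [0.5, 0.3, 0.15, 0.05]` and `p = 0.7`, the nucleus keeps only the first two tokens and rescales them to roughly 0.625 and 0.375. Unlike a fixed top-k cutoff, the number of kept tokens grows when the distribution is flat and shrinks when it is peaked.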

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification