The Curious Case of Neural Text Degeneration
Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, Yejin Choi

TL;DR
This paper investigates the phenomenon of neural text degeneration, revealing distributional differences between human and machine text, and introduces Nucleus Sampling to improve diversity and quality in neural text generation.
Contribution
The paper uncovers distributional differences between human and machine text and proposes Nucleus Sampling as an effective decoding strategy to enhance diversity and coherence.
Findings
Nucleus Sampling improves diversity in generated text.
Decoding strategies significantly impact text quality.
Distributional differences explain text degeneration phenomena.
Abstract
Despite considerable advancements with deep neural language models, the enigma of neural text degeneration persists when these models are tested as text generators. The counter-intuitive empirical observation is that even though the use of likelihood as training objective leads to high quality models for a broad range of language understanding tasks, using likelihood as a decoding objective leads to text that is bland and strangely repetitive. In this paper, we reveal surprising distributional differences between human text and machine text. In addition, we find that decoding strategies alone can dramatically effect the quality of machine text, even when generated from exactly the same neural language model. Our findings motivate Nucleus Sampling, a simple but effective method to draw the best out of neural generation. By sampling text from the dynamic nucleus of the probability…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
