Investigating Label Bias in Beam Search for Open-ended Text Generation
Liang Wang, Jinlong Liu, Jingming Liu

TL;DR
This paper investigates how label bias affects beam search in open-ended text generation and proposes training methods to reduce bias, resulting in more diverse and meaningful outputs.
Contribution
It provides empirical evidence that label bias is a major cause of degenerate beam search output and introduces training techniques that reduce the bias with almost no sacrifice in perplexity.
Findings
Reduced label bias improves diversity of generated texts.
Combining globally normalized training with maximum likelihood reduces degeneracy.
Empirical evidence shows enhanced human and automatic evaluation scores.
Abstract
Beam search is an effective and widely used decoding algorithm in many sequence-to-sequence (seq2seq) text generation tasks. However, in open-ended text generation, beam search is often found to produce repetitive and generic texts, so sampling-based decoding algorithms like top-k sampling and nucleus sampling are preferred. Standard seq2seq models suffer from label bias due to their locally normalized probability formulation. This paper provides empirical evidence that label bias is a major reason for such degenerate behaviors of beam search. By combining locally normalized maximum likelihood estimation and globally normalized sequence-level training, label bias can be reduced with almost no sacrifice in perplexity. To quantitatively measure label bias, we test the model's ability to discriminate between the ground-truth text and a set of context-agnostic distractors. We conduct…
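The two ideas the abstract leans on, a locally normalized sequence score and a discrimination test against distractors, can be illustrated with a small sketch. This is not the authors' implementation: the function names `sequence_log_prob` and `discrimination_accuracy` are hypothetical, and scoring the test as "the ground truth outranks every distractor" is an assumption, since this summary does not state the exact metric.

```python
import math
from typing import Sequence


def sequence_log_prob(step_logits: Sequence[Sequence[float]],
                      token_ids: Sequence[int]) -> float:
    """Locally normalized score: each step applies its own softmax over the
    vocabulary, and the per-step log-probabilities are summed."""
    total = 0.0
    for logits, tok in zip(step_logits, token_ids):
        # Numerically stable log-sum-exp for the per-step normalizer.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(v - m) for v in logits))
        total += logits[tok] - log_z
    return total


def discrimination_accuracy(gold_scores: Sequence[float],
                            distractor_scores: Sequence[Sequence[float]]) -> float:
    """Fraction of examples whose ground-truth score beats every distractor.
    (Assumed metric, used here only for illustration.)"""
    hits = sum(
        1
        for gold, distractors in zip(gold_scores, distractor_scores)
        if all(gold > d for d in distractors)
    )
    return hits / len(gold_scores)


# Toy usage: two contexts, each with one ground-truth score and two
# distractor scores (all values are made-up log-probabilities).
gold = [-1.0, -5.0]
distractors = [[-2.0, -3.0], [-4.0, -6.0]]
accuracy = discrimination_accuracy(gold, distractors)
```

In this toy input the first ground truth outranks both of its distractors and the second does not, so `accuracy` is 0.5. A model with strong label bias would be expected to score lower on such a test, because its per-step normalization lets fluent but context-agnostic text receive high probability.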
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence
