TL;DR
This paper investigates why beam search produces high-quality text even though it rarely finds the MAP-optimal sequence (its search error rate is very high). It shows that beam search implicitly enforces uniform information density and proposes decoding objectives that enforce this property explicitly.
Contribution
It reframes beam search as the exact solution to a different decoding objective, links that objective to the uniform information density principle from cognitive science, and proposes decoding objectives that enforce the principle explicitly.
Findings
Beam search enforces uniform information density in generated text.
Decoding objectives that explicitly promote this property improve the quality of generated language.
Adherence to uniform information density correlates with BLEU scores in translation.
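To make "uniform information density" concrete, here is a minimal sketch that scores a sequence by the variance of its per-token surprisal. It assumes surprisal is the negative log-probability of each token and that unevenness is penalized via variance; the function names, the weight `lam`, and the toy probabilities are illustrative assumptions, not details taken from the summary above.

```python
import math

def surprisals(token_probs):
    """Per-token surprisal -log p(y_t | y_<t, x), in nats."""
    return [-math.log(p) for p in token_probs]

def surprisal_variance(token_probs):
    """Population variance of per-token surprisal; 0 means perfectly uniform."""
    u = surprisals(token_probs)
    mean = sum(u) / len(u)
    return sum((x - mean) ** 2 for x in u) / len(u)

def regularized_score(token_probs, lam=1.0):
    """Sequence log-probability minus a penalty on uneven surprisal.

    `lam` is a hypothetical trade-off weight; lam=0 gives the plain log-probability.
    """
    log_prob = sum(math.log(p) for p in token_probs)
    return log_prob - lam * surprisal_variance(token_probs)

# Two toy sequences with the same total probability (0.25**4 == 0.5**3 * 0.03125).
even = [0.25, 0.25, 0.25, 0.25]      # information spread evenly
spiky = [0.5, 0.5, 0.5, 0.03125]     # one very surprising token

print(surprisal_variance(even), surprisal_variance(spiky))
print(regularized_score(even), regularized_score(spiky))
```

Both sequences are equally probable under the toy model, so a plain log-probability score cannot tell them apart; the penalized score ranks the evenly spread sequence higher.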
Abstract
Quite surprisingly, exact maximum a posteriori (MAP) decoding of neural language generators frequently leads to low-quality results. Rather, most state-of-the-art results on language generation tasks are attained using beam search despite its overwhelmingly high search error rate. This implies that the MAP objective alone does not express the properties we desire in text, which merits the question: if beam search is the answer, what was the question? We frame beam search as the exact solution to a different decoding objective in order to gain insights into why high probability under a model alone may not indicate adequacy. We find that beam search enforces uniform information density in text, a property motivated by cognitive science. We suggest a set of decoding objectives that explicitly enforce this property and find that exact decoding with these objectives alleviates the problems…
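The reframing described in the abstract can be written schematically as MAP decoding with an added penalty term. The notation below (the model p_theta, the weight lambda, and the regularizer R) is assumed for illustration and does not appear in the text above; the specific regularizers are defined in the paper and are not reproduced here.

```latex
% Requires amsmath for \operatorname*.
\[
  \mathbf{y}^{\star}
  = \operatorname*{argmax}_{\mathbf{y}}
    \; \log p_{\theta}(\mathbf{y} \mid \mathbf{x})
    \;-\; \lambda \, \mathcal{R}(\mathbf{y}),
  \qquad \lambda \ge 0
\]
```

Setting lambda to 0 recovers the standard MAP objective; a positive lambda trades model probability against whatever property R penalizes.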
