Momentum Decoding: Open-ended Text Generation As Graph Exploration
Tian Lan, Yixuan Su, Shuhang Liu, Heyan Huang, Xian-Ling Mao

TL;DR
This paper introduces momentum decoding, a novel graph exploration-based method for open-ended text generation with language models, improving inference speed and reducing degeneration compared to traditional maximization-based decoding methods.
Contribution
It formulates text generation as graph exploration and proposes momentum decoding, a new approach that encourages exploration and reduces repetition without extra computational overhead.
Findings
Performs comparably to state-of-the-art methods in quality.
Offers significantly faster inference and a lower FLOPs count.
Effectively reduces degeneration and repetition in generated texts.
Abstract
Open-ended text generation with autoregressive language models (LMs) is one of the core tasks in natural language processing. However, maximization-based decoding methods (e.g., greedy/beam search) often lead to the degeneration problem, i.e., the generated text is unnatural and contains undesirable repetitions. Existing solutions to this problem either introduce randomness prone to incoherence or require a look-ahead mechanism that demands extra computational overhead. In this study, we formulate open-ended text generation from a new perspective, i.e., we view it as an exploration process within a directed graph. Thereby, we understand the phenomenon of degeneration as circular loops within the directed graph. Based on our formulation, we propose a novel decoding method, momentum decoding, which encourages the LM to greedily explore new nodes outside the current…
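The graph view described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes each distinct token id is a node and each observed transition is a directed edge, and the function names (`build_graph`, `revisit_steps`, `pick_next`) and the constant `penalty` are hypothetical, introduced only for illustration. Under those assumptions, a step that lands on an already-visited node closes a loop, and a greedy choice can be nudged toward unvisited nodes.

```python
from typing import Dict, List, Sequence, Set


def build_graph(tokens: Sequence[int]) -> Dict[int, Set[int]]:
    """Directed graph with one node per distinct token and one edge per observed transition."""
    graph: Dict[int, Set[int]] = {tok: set() for tok in tokens}
    for src, dst in zip(tokens, tokens[1:]):
        graph[src].add(dst)
    return graph


def revisit_steps(tokens: Sequence[int]) -> List[int]:
    """Indices at which the walk steps onto a node it has already visited."""
    seen: Set[int] = set()
    steps: List[int] = []
    for i, tok in enumerate(tokens):
        if tok in seen:
            steps.append(i)
        seen.add(tok)
    return steps


def pick_next(scores: Dict[int, float], prefix: Sequence[int], penalty: float) -> int:
    """Greedy choice after subtracting `penalty` from tokens already in the graph.

    `penalty` is a hypothetical constant (an assumption for this sketch),
    not a quantity taken from the paper.
    """
    visited = set(prefix)
    adjusted = {
        tok: score - (penalty if tok in visited else 0.0)
        for tok, score in scores.items()
    }
    return max(adjusted, key=adjusted.get)


if __name__ == "__main__":
    walk = [1, 2, 3, 2, 3]  # toy token ids; 2 -> 3 -> 2 forms a loop
    print(build_graph(walk))
    print(revisit_steps(walk))
    print(pick_next({2: 0.6, 4: 0.5}, prefix=[1, 2, 3], penalty=0.2))
```

With `penalty=0.0` the last call reduces to plain greedy search and picks the already-visited token; a positive penalty shifts the choice to the unvisited one without any look-ahead.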
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Test · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
