Deferred Commitment Decoding for Diffusion Language Models
Yingte Shu, Yuchuan Tian, Chao Xu, Yunhe Wang, Hanting Chen

TL;DR
This paper introduces Deferred Commitment Decoding (DCD), a training-free decoding method for diffusion language models. DCD improves accuracy and efficiency by deferring the commitment of uncertain tokens, which mitigates Boundary-Induced Context Truncation (BICT) at block boundaries.
Contribution
DCD is a training-free decoding strategy that uses a certainty-aware sliding window to improve diffusion language model decoding quality and efficiency.
Findings
DCD improves generation accuracy by up to 16.5%.
DCD achieves a 1.73% accuracy increase on average.
DCD maintains comparable decoding time to fixed block methods.
Abstract
Diffusion language models (DLMs) have recently emerged as a strong alternative to autoregressive models by enabling parallel text generation. To improve inference efficiency and KV-cache compatibility, prior work commonly adopts block-based diffusion, decoding tokens block by block. However, this paradigm suffers from a structural limitation that we term Boundary-Induced Context Truncation (BICT): undecoded tokens near block boundaries are forced to commit without access to nearby future context, even when such context could substantially reduce uncertainty. This limitation degrades decoding certainty and generation quality, especially for tasks requiring precise reasoning, such as mathematical problem solving and code generation. We propose Deferred Commitment Decoding (DCD), a novel, training-free decoding strategy that mitigates this issue. DCD maintains a certainty-aware sliding…
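The contrast the abstract draws between block-by-block decoding and deferred commitment can be illustrated with a toy sketch. It is not the paper's algorithm: the abstract is truncated before DCD's window rule is described, so the scorer `toy_confidence`, the commit rule `commit_step`, the window size, and the threshold below are all hypothetical choices made for illustration. No language model is involved; confidence is simulated so that a position becomes more certain once its right neighbour is decoded.

```python
from typing import Callable, List, Set, Tuple

Scorer = Callable[[int, Set[int]], float]
Log = List[Tuple[int, float]]  # (position, confidence at commit time)


def toy_confidence(base: List[float], bonus: float = 0.4) -> Scorer:
    """Hypothetical scorer: right-hand context raises a position's certainty."""
    def score(i: int, decoded: Set[int]) -> float:
        boost = bonus if (i + 1) in decoded else 0.0
        return min(1.0, base[i] + boost)
    return score


def commit_step(candidates: List[int], decoded: Set[int],
                score: Scorer, threshold: float) -> List[Tuple[float, int]]:
    """Pick every candidate at or above the threshold; if none, force the best one."""
    scored = [(score(i, decoded), i) for i in candidates]
    return [(c, i) for c, i in scored if c >= threshold] or [max(scored)]


def fixed_block_decode(n: int, block: int, score: Scorer, threshold: float) -> Log:
    """Block-based decoding: a block must be fully committed before the next starts."""
    decoded: Set[int] = set()
    log: Log = []
    for start in range(0, n, block):
        todo = set(range(start, min(start + block, n)))
        while todo:
            for c, i in commit_step(sorted(todo), decoded, score, threshold):
                decoded.add(i)
                todo.discard(i)
                log.append((i, c))
    return log


def sliding_window_decode(n: int, window: int, score: Scorer, threshold: float) -> Log:
    """Assumed deferral rule: the window starts at the leftmost undecoded position,
    so an uncertain token can wait while positions to its right are committed."""
    decoded: Set[int] = set()
    log: Log = []
    while len(decoded) < n:
        left = min(i for i in range(n) if i not in decoded)
        candidates = [i for i in range(left, min(left + window, n)) if i not in decoded]
        for c, i in commit_step(candidates, decoded, score, threshold):
            decoded.add(i)
            log.append((i, c))
    return log


# Position 3 is "hard" and sits at the end of the first block of four.
base = [0.9, 0.9, 0.9, 0.5, 0.9, 0.9, 0.9, 0.9]
scorer = toy_confidence(base)
fixed_log = fixed_block_decode(8, block=4, score=scorer, threshold=0.8)
deferred_log = sliding_window_decode(8, window=4, score=scorer, threshold=0.8)

print("fixed    min confidence:", min(c for _, c in fixed_log))
print("deferred min confidence:", min(c for _, c in deferred_log))
```

On this toy input, the fixed-block run has to force position 3 while its right neighbour is still masked, so it commits below the threshold. The sliding-window run leaves position 3 pending, commits positions 4 to 6 first, and only then commits position 3 with the extra context. The sketch says nothing about real accuracy or decoding time; it only makes the boundary effect concrete.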
Taxonomy
Topics: Topic Modeling · Generative Adversarial Networks and Image Synthesis · Machine Learning in Healthcare
