No Compute Left Behind: Rethinking Reasoning and Sampling with Masked Diffusion Models

Zachary Horvitz; Raghav Singhal; Hao Zou; Carles Domingo-Enrich; Zhou Yu; Rajesh Ranganath; Kathleen McKeown

arXiv:2510.19990 · cs.LG · October 24, 2025

Open Access

TL;DR

This paper introduces reasoning-as-infilling and multi-token entropy decoding for masked diffusion language models, improving reasoning, scoring, and sampling efficiency in tasks like math and coding.

Contribution

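The paper proposes inference and training methods for masked diffusion language models (MDLMs): reasoning-as-infilling, multi-token entropy decoding, and fine-tuning on posterior reasoning traces. These target better reasoning, uncertainty estimation, and decoding efficiency relative to traditional decoding approaches.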

Findings

01

Fine-tuning on posterior reasoning traces boosts performance.

02

Reasoning-as-infilling enables scoring intermediate reasoning steps.

03

Multi-token entropy decoding (MED) reduces decoding steps by 2.7x while maintaining accuracy.

Abstract

Masked diffusion language models (MDLMs) are trained to in-fill positions in randomly masked sequences, in contrast to next-token prediction models. Discussions around MDLMs focus on two benefits: (1) any-order decoding and (2) multi-token decoding. However, we observe that for math and coding tasks, any-order algorithms often underperform or behave similarly to left-to-right sampling, and standard multi-token decoding significantly degrades performance. At inference time, MDLMs compute the conditional distribution of all masked positions. A natural question is: How can we justify this additional compute when left-to-right one-token-at-a-time decoding is on par with any-order decoding algorithms? First, we propose reasoning-as-infilling. By using MDLMs to infill a reasoning template, we can structure outputs and distinguish between reasoning and answer tokens. In turn, this enables…
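
The abstract notes that an MDLM computes a conditional distribution for every masked position at each step. As a rough illustration of how that extra compute could drive multi-token decoding, the sketch below unmasks, in a single step, every masked position whose predictive entropy falls under a threshold. This is an assumed, generic entropy-gated rule, not the paper's MED algorithm, whose details are not given on this page. The names `MASK`, `select_positions`, `decode_step`, and `max_entropy`, and the toy distributions, are all hypothetical.

```python
import math

MASK = None  # hypothetical placeholder marking a masked position


def entropy(probs):
    """Shannon entropy (in nats) of one categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_positions(sequence, dists, max_entropy):
    """Return masked positions whose predictive entropy is <= max_entropy.

    Assumption: if no position qualifies, fall back to the single
    lowest-entropy masked position so every step makes progress.
    """
    masked = [i for i, tok in enumerate(sequence) if tok is MASK]
    if not masked:
        return []
    scores = {i: entropy(dists[i]) for i in masked}
    chosen = [i for i in masked if scores[i] <= max_entropy]
    if not chosen:
        chosen = [min(masked, key=lambda i: scores[i])]
    return chosen


def decode_step(sequence, dists, vocab, max_entropy):
    """Fill each selected position with its most probable token."""
    out = list(sequence)
    for i in select_positions(sequence, dists, max_entropy):
        best = max(range(len(vocab)), key=lambda k: dists[i][k])
        out[i] = vocab[best]
    return out


# Toy stand-in for model output: one distribution per masked position.
vocab = ["a", "b", "c"]
sequence = ["a", MASK, MASK, MASK]
dists = {
    1: [0.98, 0.01, 0.01],  # confident
    2: [0.40, 0.30, 0.30],  # uncertain
    3: [0.01, 0.01, 0.98],  # confident
}
print(decode_step(sequence, dists, vocab, max_entropy=0.2))
```

With these toy numbers, positions 1 and 3 are filled in one step while the uncertain position 2 stays masked for a later step. In a real decoder, the distributions would be recomputed by the model after each step.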

Taxonomy

Topics: Topic Modeling · Language and cultural evolution · Natural Language Processing Techniques