Improving Sampling for Masked Diffusion Models via Information Gain

Kaisen Yang; Jayden Teoh; Kaicheng Yang; Yitong Zhang; Alex Lamb

arXiv:2602.18176 · cs.CL · March 19, 2026

Open Access

TL;DR

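This paper introduces the Info-Gain Sampler, a decoding method for Masked Diffusion Models that weighs the immediate uncertainty of a decoding choice against the information it provides about the tokens that remain masked. The reported gains cover reasoning accuracy, creative-writing win-rate, and cumulative uncertainty.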

Contribution

It proposes a novel decoding framework that balances immediate uncertainty with future information gain, addressing limitations of greedy heuristics in MDMs.

Findings

01 Achieves 3.6% higher accuracy on reasoning tasks

02 Attains 63.1% win-rate in creative writing

03 Reduces cumulative uncertainty from 78.4 to 48.6

Abstract

Masked Diffusion Models (MDMs) offer greater flexibility in decoding order than autoregressive models but require careful planning to achieve high-quality generation. Existing samplers typically adopt greedy heuristics, prioritizing positions with the highest local certainty to decode at each step. Through failure case analysis, we identify a fundamental limitation of this approach: it neglects the downstream impact of current decoding choices on subsequent steps and fails to minimize cumulative uncertainty. In particular, these methods do not fully exploit the non-causal nature of MDMs, which enables evaluating how a decoding decision reshapes token probabilities/uncertainty across all remaining masked positions. To bridge this gap, we propose the Info-Gain Sampler, a principled decoding framework that balances immediate uncertainty with information gain over future masked tokens.…
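The contrast the abstract draws between greedy and lookahead-based position selection can be illustrated with a small sketch. This is not the paper's implementation: the scoring rule (immediate entropy plus the expected entropy summed over the remaining masked positions), the function names, and the three-position `toy_model` are all assumptions made for illustration, since the truncated abstract does not state the exact objective.

```python
import numpy as np

MASK = -1  # hypothetical sentinel for a still-masked position


def entropy(p):
    """Shannon entropy (nats) of one categorical distribution."""
    p = np.clip(p, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())


def toy_model(seq):
    """Hypothetical stand-in for an MDM denoiser.

    Returns a (length, vocab) array of per-position token distributions.
    Position 1 is the informative one: once it is decoded, positions 0
    and 2 become deterministic copies of its token.
    """
    probs = np.array([[0.9, 0.1], [0.6, 0.4], [0.5, 0.5]])
    if seq[1] != MASK:
        one_hot = np.eye(2)[seq[1]]
        probs[0] = one_hot
        probs[2] = one_hot
    return probs


def greedy_position(seq, predict):
    """Greedy heuristic: decode the masked position with the lowest entropy."""
    probs = predict(seq)
    masked = [i for i, t in enumerate(seq) if t == MASK]
    return min(masked, key=lambda i: entropy(probs[i]))


def info_gain_position(seq, predict):
    """Assumed lookahead score: immediate entropy at position i plus the
    expected total entropy left at the other masked positions after
    decoding i. The position with the lowest score is chosen."""
    probs = predict(seq)
    masked = [i for i, t in enumerate(seq) if t == MASK]
    scores = {}
    for i in masked:
        expected_future = 0.0
        for token, p_token in enumerate(probs[i]):
            trial = list(seq)
            trial[i] = token  # hypothesize this decoding choice
            future = predict(trial)  # non-causal: all positions are re-scored
            remaining = sum(entropy(future[j]) for j in masked if j != i)
            expected_future += p_token * remaining
        scores[i] = entropy(probs[i]) + expected_future
    return min(scores, key=scores.get)


start = [MASK, MASK, MASK]
print("greedy picks position", greedy_position(start, toy_model))
print("lookahead picks position", info_gain_position(start, toy_model))
```

In this toy setup the greedy rule decodes position 0 because it is the most certain locally, while the lookahead rule decodes position 1 because doing so removes nearly all uncertainty at the other two positions. The lookahead costs one extra model call per candidate token, which is a property of this sketch rather than a claim about the paper's method.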

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications