Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking
Heli Ben-Hamu, Itai Gat, Daniel Severo, Niklas Nolte, Brian Karrer

TL;DR
This paper introduces EB-Sampler, an efficient sampling method for masked diffusion models that unmasks multiple tokens per model evaluation, speeding up sampling by roughly 2-3x without performance loss on coding, math, and reasoning tasks.
Contribution
The paper proposes EB-Sampler, a novel adaptive unmasking algorithm that accelerates sampling from masked diffusion models by 2-3 times while maintaining accuracy.
Findings
Speeds up sampling by 2-3x on coding and math benchmarks.
Effective on reasoning tasks like maze navigation and Sudoku.
Maintains performance despite increased sampling speed.
Abstract
Recent masked diffusion models (MDMs) have shown competitive performance compared to autoregressive models (ARMs) for language modeling. While most literature has focused on performance enhancing sampling procedures, efficient sampling from MDMs has been scarcely explored. We make the observation that often a given sequence of partially masked tokens determines the values of multiple unknown tokens deterministically, meaning that a single prediction of a masked model holds additional information unused by standard sampling procedures. Based on this observation, we introduce EB-Sampler, a simple drop-in replacement for existing samplers, utilizing an Entropy Bounded unmasking procedure that dynamically unmasks multiple tokens in one function evaluation with predefined approximate error tolerance. We formulate the EB-Sampler as part of a broad family of adaptive samplers for which we…
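The entropy-bounded idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `gamma` tolerance parameter, and the exact selection rule (rank masked positions by predictive entropy, then keep adding positions while the total entropy of those already chosen stays within `gamma`) are assumptions made for illustration, and the paper's actual criterion may differ.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_positions_to_unmask(masked_probs, gamma):
    """Pick which masked positions to unmask after one model evaluation.

    masked_probs: dict mapping a masked position to the model's predicted
        token distribution at that position (hypothetical input format).
    gamma: entropy budget, standing in for the error tolerance (assumed name).

    Positions are ranked from most to least certain. The most certain
    position is always chosen, so sampling always makes progress. Each
    further position is added only while the summed entropy of the
    positions already chosen is at most gamma.
    """
    ranked = sorted(masked_probs, key=lambda pos: entropy(masked_probs[pos]))
    chosen = []
    spent = 0.0  # total entropy of the positions chosen so far
    for pos in ranked:
        if chosen and spent > gamma:
            break
        chosen.append(pos)
        spent += entropy(masked_probs[pos])
    return chosen


# Toy predictions: position 3 is fully determined, 5 is nearly determined,
# 7 and 9 are genuinely uncertain.
toy = {
    3: [1.0, 0.0],
    5: [0.99, 0.01],
    7: [0.5, 0.5],
    9: [0.25, 0.25, 0.25, 0.25],
}
print(select_positions_to_unmask(toy, gamma=0.01))
```

With a small budget such as `gamma=0.01`, the sketch unmasks the two near-deterministic positions together and leaves the uncertain ones for later evaluations. A larger budget unmasks more positions per step, trading accuracy for fewer function evaluations.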
Taxonomy
Topics: Statistical Methods and Inference · Model Reduction and Neural Networks · Numerical methods in inverse problems
Methods: Diffusion
