Is Your Diffusion Sampler Actually Correct? A Sampler-Centric Evaluation of Discrete Diffusion Language Models
Luhan Tang, Longxuan Yu, Shaorong Zhang, Greg Ver Steeg

TL;DR
This paper introduces an evaluation framework for discrete diffusion language models that isolates sampler-induced error from denoiser approximation error. It shows that few-step samplers can be distributionally incorrect even when likelihood or perplexity improves.
Contribution
It proposes a sampler-centric oracle framework that replaces the learned denoiser with an exact HMM posterior derived from a ground-truth Markov chain, exposing sampler correctness issues that existing metrics do not capture.
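To make the idea of an oracle denoiser concrete, here is a minimal sketch of how exact per-position posteriors could be computed for a masked sequence drawn from a known Markov chain, using the standard forward-backward recursion. It is an illustration under stated assumptions, not the paper's implementation: the names `MASK` and `posterior_marginals` are hypothetical, masked positions are treated as uninformative observations, and the revealed tokens are assumed to have nonzero probability under the chain.

```python
MASK = None  # hypothetical marker for a masked position (assumption)


def posterior_marginals(init, trans, observed):
    """Exact posteriors p(x_i | revealed tokens) for a first-order Markov chain.

    init[a] is the initial probability of state a, trans[a][b] is the
    transition probability a -> b, and observed[i] is a state index or MASK.
    Assumes the revealed tokens are possible under the chain.
    """
    n_states, length = len(init), len(observed)

    def likelihood(i, s):
        # A masked position carries no evidence; a revealed one pins the state.
        return 1.0 if observed[i] is MASK or observed[i] == s else 0.0

    # Forward pass: joint probability of the evidence up to i and state s at i.
    fwd = [[init[s] * likelihood(0, s) for s in range(n_states)]]
    for i in range(1, length):
        prev = fwd[-1]
        fwd.append([
            likelihood(i, s) * sum(prev[r] * trans[r][s] for r in range(n_states))
            for s in range(n_states)
        ])

    # Backward pass: probability of the evidence after i given state s at i.
    bwd = [[1.0] * n_states for _ in range(length)]
    for i in range(length - 2, -1, -1):
        bwd[i] = [
            sum(trans[s][r] * likelihood(i + 1, r) * bwd[i + 1][r]
                for r in range(n_states))
            for s in range(n_states)
        ]

    # Combine and normalize to get the per-position posterior.
    marginals = []
    for i in range(length):
        unnorm = [fwd[i][s] * bwd[i][s] for s in range(n_states)]
        total = sum(unnorm)
        marginals.append([u / total for u in unnorm])
    return marginals


# Usage: a "sticky" two-state chain with the middle token masked.
example = posterior_marginals(
    init=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.1, 0.9]],
    observed=[0, MASK, 0],
)
```

With both neighbors revealed as state 0, the posterior for the masked middle position concentrates on state 0, as expected for a chain that rarely switches state.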
Findings
Few-step samplers are not distributionally correct even with an oracle denoiser.
Improvements in likelihood or perplexity do not guarantee correct sampling.
Transition-level mismatch vanishes only as the number of sampling steps approaches the sequence length.
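The findings above can be illustrated with a toy calculation. The sketch below is a hypothetical example, not the paper's experimental setup: it assumes a two-state "sticky" Markov chain, a sequence of length 2 that starts fully masked, and a fixed left-to-right unmasking order for the two-step case. With one step, both positions are sampled in parallel from their exact marginals, so the sampled joint is a product of marginals and loses the transition structure. With two steps (equal to the sequence length), the second position is sampled from the exact conditional and the true joint is recovered.

```python
from itertools import product

# Assumed toy ground truth: a two-state chain that rarely switches state.
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]

# Ground-truth joint distribution over length-2 sequences.
true_joint = {
    (a, b): init[a] * trans[a][b] for a, b in product(range(2), repeat=2)
}

# Exact oracle marginals of each position when both positions are masked.
marg0 = [sum(true_joint[(a, b)] for b in range(2)) for a in range(2)]
marg1 = [sum(true_joint[(a, b)] for a in range(2)) for b in range(2)]

# One step: unmask both positions in parallel, independently of each other.
one_step = {(a, b): marg0[a] * marg1[b] for a, b in true_joint}

# Two steps: unmask position 0, then position 1 from the exact conditional.
two_step = {
    (a, b): marg0[a] * (true_joint[(a, b)] / marg0[a]) for a, b in true_joint
}


def total_variation(p, q):
    """Total variation distance between two distributions on the same keys."""
    return 0.5 * sum(abs(p[k] - q[k]) for k in p)


tv_one = total_variation(one_step, true_joint)
tv_two = total_variation(two_step, true_joint)
```

In this toy setting the one-step sampler assigns probability 0.25 to every pair, while the true chain puts 0.45 on each of the two "same state" pairs, so the error comes entirely from the sampler even though every marginal it uses is exact.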
Abstract
Discrete diffusion language models (dLLMs) provide a fast and flexible alternative to autoregressive models (ARMs) via iterative denoising with parallel updates. However, their evaluation is challenging: existing metrics conflate denoiser approximation error with sampler-induced error from the sampling dynamics, a problem that does not arise for ARMs, whose autoregressive sampling exactly reflects the learned probability model. We introduce a sampler-centric oracle framework that replaces learned denoisers with an exact Hidden Markov Model posterior derived from a ground-truth Markov chain, isolating sampler-induced error in a controlled setting. We show that few-step discrete diffusion samplers are not distributionally correct even under an oracle denoiser, with transition-level mismatch that vanishes only as the number of steps approaches the sequence length. Moreover, improvements in…
Taxonomy
Topics: Language and cultural evolution · Topic Modeling · Generative Adversarial Networks and Image Synthesis
