Is Your Diffusion Sampler Actually Correct? A Sampler-Centric Evaluation of Discrete Diffusion Language Models

Luhan Tang; Longxuan Yu; Shaorong Zhang; Greg Ver Steeg

arXiv:2602.19619·cs.LG·February 24, 2026

TL;DR

This paper introduces a sampler-centric evaluation framework for discrete diffusion language models that isolates sampler-induced error from denoiser error. It shows that few-step samplers are not distributionally correct, and that improvements in likelihood or perplexity do not guarantee correct sampling.

Contribution

The paper proposes a sampler-centric oracle framework that replaces the learned denoiser with an exact Hidden Markov Model (HMM) posterior derived from a ground-truth Markov chain, exposing sampler correctness issues that existing metrics do not capture.

Findings

01

Few-step samplers are not distributionally correct even with an oracle denoiser.

02

Improvements in likelihood or perplexity do not guarantee correct sampling.

03

Transition-level mismatch vanishes only as the number of steps approaches the sequence length.

Abstract

Discrete diffusion language models (dLLMs) provide a fast and flexible alternative to autoregressive models (ARMs) via iterative denoising with parallel updates. However, their evaluation is challenging: existing metrics conflate denoiser approximation error with sampler-induced error from the sampling dynamics, a problem that does not arise for ARMs whose autoregressive sampling exactly reflects the learned probability model. We introduce a sampler-centric oracle framework that replaces learned denoisers with an exact Hidden Markov Model posterior derived from a ground-truth Markov chain, isolating sampler-induced error in a controlled setting. We show that few-step discrete diffusion samplers are not distributionally correct even under an oracle denoiser, with transition-level mismatch that vanishes only as the number of steps approaches the sequence length. Moreover, improvements in…
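
The oracle idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the two-token chain, its numbers, the left-to-right chunked unmasking schedule, the brute-force posterior (standing in for an HMM forward-backward pass), and the names `oracle_marginals`, `sampler_distribution`, and `tv_to_truth` are all assumptions made for illustration. The sketch computes the exact output distribution of a sampler that unmasks several positions per round, drawing each one independently from exact posterior marginals, and compares it with the ground-truth chain using sequence-level total variation (a coarser quantity than the transition-level mismatch the paper reports).

```python
from itertools import product

# Toy ground-truth Markov chain over 2 tokens (all numbers are illustrative assumptions).
PI = [0.5, 0.5]                    # initial distribution
P = [[0.9, 0.1], [0.1, 0.9]]       # P[a][b] = Pr(next = b | current = a)


def chain_prob(seq):
    """Exact probability of a full sequence under the ground-truth chain."""
    p = PI[seq[0]]
    for a, b in zip(seq, seq[1:]):
        p *= P[a][b]
    return p


def oracle_marginals(partial):
    """Exact posterior marginals per position, given revealed tokens (None = masked).

    Brute-force enumeration; only feasible for tiny sequence lengths.
    """
    length = len(partial)
    marg = [[0.0, 0.0] for _ in range(length)]
    total = 0.0
    for seq in product(range(2), repeat=length):
        if any(t is not None and t != s for t, s in zip(partial, seq)):
            continue  # inconsistent with the already revealed tokens
        p = chain_prob(seq)
        total += p
        for i, s in enumerate(seq):
            marg[i][s] += p
    return [[m / total for m in row] for row in marg]


def sampler_distribution(length, steps):
    """Exact output distribution of a chunked parallel sampler with an oracle denoiser.

    Positions are unmasked left to right in chunks of ceil(length / steps); within a
    round, every position in the chunk is drawn independently from its marginal.
    """
    chunk = -(-length // steps)  # ceiling division
    dist = {tuple([None] * length): 1.0}
    for start in range(0, length, chunk):
        positions = range(start, min(start + chunk, length))
        new_dist = {}
        for partial, p in dist.items():
            marg = oracle_marginals(list(partial))
            for toks in product(range(2), repeat=len(positions)):
                q, nxt = p, list(partial)
                for pos, t in zip(positions, toks):
                    q *= marg[pos][t]
                    nxt[pos] = t
                key = tuple(nxt)
                new_dist[key] = new_dist.get(key, 0.0) + q
        dist = new_dist
    return dist


def tv_to_truth(dist):
    """Total variation distance between a sampler's output law and the true chain."""
    length = len(next(iter(dist)))
    return 0.5 * sum(
        abs(dist.get(seq, 0.0) - chain_prob(seq))
        for seq in product(range(2), repeat=length)
    )


if __name__ == "__main__":
    for steps in (1, 2, 4):
        print(steps, tv_to_truth(sampler_distribution(4, steps)))
```

In this toy setting the distance is strictly positive for one and two rounds and is zero (up to floating-point error) only when the number of rounds equals the sequence length, because tokens drawn in the same round ignore each other even though every marginal is exact.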

Taxonomy

Topics: Language and cultural evolution · Topic Modeling · Generative Adversarial Networks and Image Synthesis