Consistent Diffusion Language Models

Hasan Amin; Yuan Gao; Yaser Souri; Subhojit Som; Ming Yin; Rajiv Khanna; Xia Song

arXiv:2605.00161 · cs.LG · May 4, 2026

TL;DR

This paper introduces the Consistent Diffusion Language Model (CDLM), a training framework for discrete diffusion models that targets faster, high-quality text generation by training the denoiser to be path-invariant in expectation across stochastic posterior bridges.

Contribution

It proposes Multi-Path Discrete Consistency (MPDC) and unifies various diffusion approaches into a single, scalable training method for discrete language models.

Findings

01. CDLM achieves state-of-the-art results in text generation tasks.

02. It outperforms existing discrete diffusion models across different sampling budgets.

03. Significant improvements are observed in low-step, fast sampling regimes.

Abstract

Diffusion language models (DLMs) are an attractive alternative to autoregressive models because they promise sublinear-time, parallel generation, yet practical gains remain elusive as high-quality samples still demand hundreds of refinement steps. In continuous domains, consistency training along the probability-flow ODE is a popular recipe to accelerate diffusion. For discrete diffusion, no analogous sample-space ODE exists, making direct adaptation ill-defined. We argue that the natural discrete substitute is not a deterministic trajectory but its stochastic counterpart: the exact posterior bridge, available in closed form for broad corruption families including masked and uniform diffusion. Building on this observation, we introduce Multi-Path Discrete Consistency (MPDC), a new principle that trains a denoiser to be path-invariant in expectation across these stochastic bridges, and…
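To make the "exact posterior bridge" concrete, here is a minimal sketch for the masked (absorbing-state) case only. It is not the paper's implementation: the linear schedule `alpha(t) = 1 - t`, the `[MASK]` string, and the function names `forward_mask` and `posterior_bridge` are all illustrative assumptions. It relies on the standard closed-form result for masked diffusion: given the clean sequence x_0 and a noisier sequence x_t, a token that is masked at time t is revealed at an earlier time s < t with probability (alpha(s) - alpha(t)) / (1 - alpha(t)), and a token already visible at time t stays visible.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, chosen for illustration


def alpha(t):
    """Assumed linear schedule: probability a token is still unmasked at time t."""
    return 1.0 - t


def forward_mask(x0, t, rng):
    """Corrupt x0 to time t by masking each token independently with prob 1 - alpha(t)."""
    return [tok if rng.random() < alpha(t) else MASK for tok in x0]


def posterior_bridge(x0, xt, s, t, rng):
    """Sample x_s ~ q(x_s | x_t, x_0) for 0 <= s < t <= 1 under absorbing-state masking."""
    assert 0.0 <= s < t <= 1.0
    # Probability that a token masked at time t was still visible at time s.
    reveal = (alpha(s) - alpha(t)) / (1.0 - alpha(t))
    xs = []
    for clean, noisy in zip(x0, xt):
        if noisy != MASK:
            xs.append(noisy)   # visible at t implies visible at every earlier s
        elif rng.random() < reveal:
            xs.append(clean)   # revealed somewhere between s and t
        else:
            xs.append(MASK)    # still masked at s
    return xs


rng = random.Random(0)
x0 = "the cat sat on the mat".split()
xt = forward_mask(x0, 0.8, rng)
# Two bridge samples at different intermediate times, between the same x_t and x_0.
xs_a = posterior_bridge(x0, xt, 0.5, 0.8, rng)
xs_b = posterior_bridge(x0, xt, 0.2, 0.8, rng)
```

Here `xs_a` and `xs_b` are two stochastic intermediate states linking the same endpoints. As the abstract describes it, MPDC trains the denoiser so that its predictions agree in expectation across such bridges; the training loss itself is not shown here because the excerpt above does not specify it.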
