Stability-Weighted Decoding for Diffusion Language Models

Yue Wu; Jian Huang

arXiv:2604.17068·cs.CL·April 21, 2026

TL;DR

This paper introduces Stability-Weighted Decoding (SWD), a training-free decoding method for diffusion language models that factors each token's temporal stability across denoising steps into its unmasking score, improving the accuracy and robustness of generation.

Contribution

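The paper shows theoretically that a token's temporal instability, measured as the KL divergence between its consecutive prediction distributions, lower-bounds its mutual information with the remaining masked context. Building on this result, it proposes SWD, a plug-and-play decoding strategy that modulates existing token scores to improve diffusion LLM performance.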

Findings

01. SWD improves accuracy across code and math benchmarks.

02. SWD maintains performance across different decoding policies.

03. SWD exhibits robustness under various acceleration ratios.

Abstract

Diffusion large language models (dLLMs) enable parallel text generation by iteratively denoising a fully masked sequence, unmasking a subset of masked tokens at each step. Existing decoding strategies rely on static confidence metrics computed at a single denoising step, ignoring temporal history and often leading to premature unmasking of unstable tokens. In this work, we theoretically establish that a token's temporal instability, quantified by the KL divergence between consecutive prediction distributions, provides a strict lower bound on its mutual information with the remaining masked context, indicating that temporally unstable tokens are inherently unsafe to unmask. Based on this insight, we propose Stability-Weighted Decoding (SWD), a training-free, plug-and-play strategy that incorporates temporal stability into token scoring and acts as a universal modulator for arbitrary…
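
The scoring idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the exact weighting, so the multiplicative form `confidence * exp(-lam * KL)`, the KL direction (current step against previous step), the parameter `lam`, and all function names below are assumptions chosen only for illustration.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

def stability_weighted_scores(prev_probs, curr_probs, lam=1.0):
    """Damp a base confidence score by temporal instability.

    prev_probs, curr_probs: dict mapping masked position -> probability list
        over the vocabulary at the previous and current denoising step.
    lam: hypothetical strength of the stability penalty (an assumption).
    """
    scores = {}
    for pos, curr in curr_probs.items():
        confidence = max(curr)  # static, single-step confidence
        instability = kl_divergence(curr, prev_probs[pos])  # assumed KL direction
        scores[pos] = confidence * math.exp(-lam * instability)  # assumed weighting
    return scores

def select_to_unmask(scores, k):
    """Return the k positions with the highest stability-weighted score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy example: position 1 is more confident at the current step,
# but its prediction flipped between steps, so it is scored as unstable.
prev = {0: [0.90, 0.05, 0.05], 1: [0.10, 0.10, 0.80]}
curr = {0: [0.90, 0.05, 0.05], 1: [0.95, 0.03, 0.02]}
scores = stability_weighted_scores(prev, curr)
chosen = select_to_unmask(scores, k=1)
```

In this toy case a purely static confidence rule would unmask position 1 (confidence 0.95), whereas the stability-weighted score prefers position 0, whose prediction did not change between steps. Any base score could stand in for `max(curr)`, which is the sense in which the abstract describes SWD as a modulator.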
