Confidence-Based Decoding is Provably Efficient for Diffusion Language Models

Changxiao Cai; Gen Li

arXiv:2603.22248·cs.LG·March 24, 2026

TL;DR

This paper provides the first theoretical analysis of confidence-based decoding in diffusion language models. It shows that adaptively controlling how many tokens are unmasked, based on prediction entropy, yields efficient sampling, with the largest gains on low-entropy data.

Contribution

The paper introduces a theoretical framework for confidence-based decoding in DLMs and bounds the expected number of decoding iterations, showing that the strategy adapts to data complexity without tuning.

Findings

01

Achieves $\tilde{O}(H(X_0)/\varepsilon)$ expected iterations for $\varepsilon$-accurate sampling.

02

Automatically adapts to data entropy, providing acceleration for low-entropy distributions.

03

Provides a foundation for designing more efficient decoding strategies in diffusion language models.
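To make the entropy dependence in the first two findings concrete, the sketch below compares the leading term $H(X_0)/\varepsilon$ for two toy distributions. Everything in it is an illustrative assumption rather than material from the paper: the sequence length, vocabulary size, token probabilities, the independence of tokens, the use of nats, and the helper names `shannon_entropy` and `iteration_scale`. Constants and logarithmic factors hidden by $\tilde{O}$ are dropped, so only the ratio between the two cases is meaningful.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution given as a list."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def iteration_scale(entropy, eps):
    """Leading term H / eps of the bound; constants and log factors are ignored."""
    return entropy / eps


# Hypothetical toy data: seq_len independent tokens over a 4-symbol vocabulary.
seq_len, eps = 64, 0.1
uniform = [0.25, 0.25, 0.25, 0.25]   # high-entropy tokens
peaked = [0.97, 0.01, 0.01, 0.01]    # low-entropy tokens

# For independent tokens, the entropy of the sequence is the sum of token entropies.
h_uniform = seq_len * shannon_entropy(uniform)
h_peaked = seq_len * shannon_entropy(peaked)

print(f"uniform: H = {h_uniform:.2f}, H/eps = {iteration_scale(h_uniform, eps):.1f}")
print(f"peaked:  H = {h_peaked:.2f}, H/eps = {iteration_scale(h_peaked, eps):.1f}")
```

Under these assumptions the peaked distribution has a much smaller entropy than the uniform one, so the same target accuracy corresponds to a proportionally smaller leading term.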

Abstract

Diffusion language models (DLMs) have emerged as a promising alternative to autoregressive (AR) models for language modeling, allowing flexible generation order and parallel generation of multiple tokens. However, this flexibility introduces a challenge absent in AR models: the decoding strategy, which determines the order and number of tokens generated at each iteration, critically affects sampling efficiency. Among decoding strategies explored in practice, confidence-based methods, which adaptively select which and how many tokens to unmask based on prediction confidence, have shown strong empirical performance. Despite this success, our theoretical understanding of confidence-based decoding remains limited. In this work, we develop the first theoretical analysis framework for confidence-based decoding in DLMs. We focus on an entropy sum-based strategy that continues…
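The abstract is cut off before it states the stopping rule, so the following is only a minimal sketch of one plausible entropy-sum rule, not the authors' algorithm. It assumes that, within an iteration, masked positions are taken in order of increasing predictive entropy for as long as the running entropy sum stays within a threshold, and that at least one position is always unmasked so decoding makes progress. The function names, the `predictions` dictionary, and the `threshold` parameter are hypothetical.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one position's predicted distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_positions(predictions, threshold):
    """Choose which masked positions to unmask in one iteration.

    predictions: dict mapping a masked position to its predicted distribution.
    Assumed rule (not taken from the paper): add positions from most to least
    confident while the cumulative entropy stays within `threshold`; the first
    position is always taken so every iteration unmasks something.
    """
    ranked = sorted(predictions, key=lambda pos: token_entropy(predictions[pos]))
    chosen, total = [], 0.0
    for pos in ranked:
        h = token_entropy(predictions[pos])
        if chosen and total + h > threshold:
            break
        chosen.append(pos)
        total += h
    return chosen


# Hypothetical model output for four masked positions over a 2-symbol vocabulary.
predictions = {
    0: [0.99, 0.01],
    1: [0.5, 0.5],
    2: [0.999, 0.001],
    3: [0.9, 0.1],
}
print(select_positions(predictions, threshold=0.1))
```

With this toy input, the two most confident positions fit within the threshold and the uncertain ones are deferred to later iterations. This is the adaptive behavior the abstract attributes to confidence-based methods: more tokens are unmasked in parallel when predictions are confident, fewer when they are not.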

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis