Adaptation to Intrinsic Dependence in Diffusion Language Models

Yunxiao Zhao; Changxiao Cai

arXiv:2602.20126 · cs.LG · February 24, 2026

Open Access

TL;DR

This paper introduces a distribution-agnostic, adaptive unmasking schedule for diffusion language models that improves sampling efficiency by accounting for the data's intrinsic dependence structure, with theoretical convergence guarantees.

Contribution

The paper proposes a randomized unmasking schedule that adapts to the data's dependence structure without prior knowledge of it, and establishes improved sampling convergence guarantees for diffusion language models.

Findings

01

Convergence rates scale with total correlation measures of data dependence.

02

The method accelerates sampling for low-complexity distributions.

03

Guarantees hold in parallel-sampling regimes, enhancing practical applicability.
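Finding 01 ties convergence rates to total correlation. As a point of reference, the textbook total correlation of a random vector X = (X_1, ..., X_n) is the sum of the marginal entropies minus the joint entropy; it is zero when the coordinates are independent and grows with dependence. The sketch below computes it for a small discrete distribution. It assumes this standard definition; the exact measures used in the paper are not spelled out on this page, and the function names and toy distributions are illustrative only.

```python
import math
from itertools import product


def entropy(probs):
    """Shannon entropy (in nats) of an iterable of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def total_correlation(joint):
    """Textbook total correlation: sum_i H(X_i) - H(X_1, ..., X_n).

    `joint` maps outcome tuples (x_1, ..., x_n) to probabilities.
    This is the standard definition, assumed here for illustration.
    """
    n = len(next(iter(joint)))
    marginal_entropy_sum = 0.0
    for i in range(n):
        # Marginalize the joint distribution onto coordinate i.
        marginal = {}
        for outcome, p in joint.items():
            marginal[outcome[i]] = marginal.get(outcome[i], 0.0) + p
        marginal_entropy_sum += entropy(marginal.values())
    return marginal_entropy_sum - entropy(joint.values())


# Toy example 1: three independent fair bits (no dependence).
independent = {x: 1 / 8 for x in product((0, 1), repeat=3)}

# Toy example 2: one fair bit copied three times (maximal dependence).
copied = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}

print(total_correlation(independent))
print(total_correlation(copied))
```

The independent bits give a value of zero up to floating-point error, while the copied bit gives 2 ln 2, since three marginal entropies of ln 2 each are offset by a joint entropy of only ln 2.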

Abstract

Diffusion language models (DLMs) have recently emerged as a promising alternative to autoregressive (AR) approaches, enabling parallel token generation beyond a rigid left-to-right order. Despite growing empirical success, the theoretical understanding of how unmasking schedules -- which specify the order and size of unmasked tokens during sampling -- affect generation quality remains limited. In this work, we introduce a distribution-agnostic unmasking schedule for DLMs that adapts to the (unknown) dependence structure of the target data distribution, without requiring any prior knowledge or hyperparameter tuning. In contrast to prior deterministic procedures that fix unmasking sizes, our method randomizes the number of tokens revealed at each iteration. We show that, for two specific parameter choices, the sampling convergence guarantees -- measured by Kullback-Leibler (KL) divergence…
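The contrast the abstract draws, between deterministic schedules with fixed unmasking sizes and a schedule that randomizes the number of tokens revealed per iteration, can be pictured with a toy sketch. This is not the paper's schedule: the uniform draw of the step size, the `max_size` cap, and all function names are hypothetical placeholders chosen only to show the structural difference, and no model or token prediction is involved.

```python
import random


def fixed_size_schedule(num_tokens, block_size):
    """Deterministic baseline: reveal `block_size` tokens per iteration."""
    sizes, remaining = [], num_tokens
    while remaining > 0:
        k = min(block_size, remaining)
        sizes.append(k)
        remaining -= k
    return sizes


def randomized_size_schedule(num_tokens, max_size, rng):
    """Toy randomized schedule: the number of tokens revealed per iteration is random.

    ASSUMPTION: sizes are drawn uniformly from 1..max_size purely as a
    placeholder; the paper's actual size distribution is not given here.
    """
    sizes, remaining = [], num_tokens
    while remaining > 0:
        k = rng.randint(1, min(max_size, remaining))  # inclusive bounds
        sizes.append(k)
        remaining -= k
    return sizes


def assign_positions(sizes, rng):
    """Pick which masked positions each iteration reveals, in random order."""
    positions = list(range(sum(sizes)))
    rng.shuffle(positions)
    steps, start = [], 0
    for k in sizes:
        steps.append(sorted(positions[start:start + k]))
        start += k
    return steps


rng = random.Random(0)
fixed = fixed_size_schedule(num_tokens=16, block_size=4)
randomized = randomized_size_schedule(num_tokens=16, max_size=4, rng=rng)
steps = assign_positions(randomized, rng)
print(fixed)
print(randomized)
print(steps)
```

Both schedules reveal every position exactly once; they differ only in whether the per-iteration batch size is fixed in advance or drawn at random.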

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Natural Language Processing Techniques