Introspective Diffusion Language Models

Yifan Yu; Yuqing Jian; Junxiong Wang; Zhongzhu Zhou; Donglin Zhuang; Xinyu Fang; Sri Yanamandra; Xiaoxia Wu; Qingyang Wu; Shuaiwen Leon Song; Tri Dao; Ben Athiwaratkun; James Zou; Fan Lai; Chenfeng Xu

arXiv:2604.11035·cs.AI·April 14, 2026

1 Repo 3 Models

TL;DR

This paper introduces Introspective Diffusion Language Models (I-DLMs), which pair diffusion-style parallel decoding with the introspective consistency of autoregressive (AR) training, matching the quality of same-scale AR models while serving faster than prior diffusion language models (DLMs).

Contribution

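The paper presents I-DLM, a paradigm built on introspective strided decoding (ISD), an algorithm that verifies previously generated tokens while advancing new ones in the same forward pass. The result is a diffusion language model with AR-level quality and higher serving efficiency.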

Findings

01 I-DLM matches the quality of same-scale autoregressive models.

02 I-DLM outperforms prior diffusion models on 15 benchmarks.

03 I-DLM delivers about 3x higher throughput than previous DLMs.

Abstract

Diffusion language models (DLMs) promise parallel generation, yet still lag behind autoregressive (AR) models in quality. We trace this gap to a failure of introspective consistency: AR models agree with their own generations, while DLMs often do not. We define the introspective acceptance rate, which measures whether a model accepts its previously generated tokens. This reveals why AR training has a structural advantage: causal masking and logit shifting implicitly enforce introspective consistency. Motivated by this observation, we introduce the Introspective Diffusion Language Model (I-DLM), a paradigm that retains diffusion-style parallel decoding while inheriting the introspective consistency of AR training. I-DLM uses a novel introspective strided decoding (ISD) algorithm, which enables the model to verify previously generated tokens while advancing new ones in the same forward pass. From a…
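The introspective acceptance rate described above can be illustrated with a small sketch. The excerpt does not give the paper's exact definition, so this version makes an assumption: a generated token counts as "accepted" when the model, re-reading its own output with teacher forcing, would greedily pick that same token. The function name, the `next_token` callable, and the toy model are all hypothetical and are not taken from the paper or its repository.

```python
from typing import Callable, Sequence

# Hypothetical scorer: maps a prefix of token ids to the token id the model
# would pick next under greedy decoding. This is an illustrative stand-in,
# not an interface from the I-DLM codebase.
NextToken = Callable[[Sequence[int]], int]


def introspective_acceptance_rate(
    prompt: Sequence[int],
    generated: Sequence[int],
    next_token: NextToken,
) -> float:
    """Fraction of generated tokens the model agrees with on re-reading.

    Assumption: "accepts" means greedy agreement under teacher forcing.
    The paper may use a probabilistic acceptance rule instead.
    """
    if not generated:
        raise ValueError("generated must contain at least one token")
    context = list(prompt)
    accepted = 0
    for tok in generated:
        # Ask the model what it would emit given everything before this token.
        if next_token(context) == tok:
            accepted += 1
        # Teacher forcing: always extend the context with the actual token.
        context.append(tok)
    return accepted / len(generated)


def toy_next_token(prefix: Sequence[int]) -> int:
    # Toy "model" that always predicts the previous token plus one, mod 10.
    return (prefix[-1] + 1) % 10


# A sequence the toy model fully agrees with, and one it partly rejects.
rate_consistent = introspective_acceptance_rate([0], [1, 2, 3, 4], toy_next_token)
rate_inconsistent = introspective_acceptance_rate([0], [1, 2, 7, 8], toy_next_token)
print(rate_consistent, rate_inconsistent)
```

In this toy setting the first sequence is one the model would have produced itself, so every token is accepted; the second contains one token the model would not have chosen, which lowers the rate. Under the abstract's framing, an AR model decoding greedily sits at the first extreme by construction, while a DLM may fall below it.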

Code & Models

Repositories

introspective-diffusion/I-DLM (GitHub)

Models
