TL;DR
This paper introduces Introspective Diffusion Language Models (I-DLMs), which combine diffusion-style parallel decoding with introspective consistency to match the quality of same-scale autoregressive models while delivering higher throughput than prior diffusion language models.
Contribution
The paper presents I-DLM, a novel paradigm with an introspective strided decoding algorithm that achieves AR-level quality and higher serving efficiency for diffusion language models.
Findings
I-DLM matches the quality of same-scale autoregressive models.
I-DLM outperforms prior diffusion models on 15 benchmarks.
I-DLM delivers about 3x higher throughput than previous DLMs.
Abstract
Diffusion language models promise parallel generation, yet still lag behind autoregressive (AR) models in quality. We trace this gap to a failure of introspective consistency: AR models agree with their own generations, while DLMs often do not. We define the introspective acceptance rate, which measures whether a model accepts its previously generated tokens. This reveals why AR training has a structural advantage: causal masking and logit shifting implicitly enforce introspective consistency. Motivated by this observation, we introduce the Introspective Diffusion Language Model (I-DLM), a paradigm that retains diffusion-style parallel decoding while inheriting the introspective consistency of AR training. I-DLM uses a novel introspective strided decoding (ISD) algorithm, which enables the model to verify previously generated tokens while advancing new ones in the same forward pass. From a…
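To make the idea of an introspective acceptance rate concrete, here is a minimal Python sketch. The abstract does not give the exact formula, so this assumes a simple greedy criterion: a generated token counts as "accepted" when it equals the model's own top-scoring token after the same prefix is re-scored. The function names, the `score_fn` interface, and the toy model are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence

# Hypothetical interface (an assumption, not the paper's API): given a prefix
# of token ids, return one score per vocabulary entry for the next position.
ScoreFn = Callable[[Sequence[int]], Sequence[float]]


def introspective_acceptance_rate(
    score_fn: ScoreFn, prompt: Sequence[int], generated: Sequence[int]
) -> float:
    """Fraction of generated tokens the model agrees with on re-scoring.

    Assumed criterion: a token is accepted if it is the argmax of the
    model's scores for that position, given the prompt plus the tokens
    generated before it.
    """
    if not generated:
        raise ValueError("generated must contain at least one token")
    accepted = 0
    context: List[int] = list(prompt)
    for tok in generated:
        scores = score_fn(context)
        best = max(range(len(scores)), key=scores.__getitem__)
        if best == tok:
            accepted += 1
        context.append(tok)  # condition later positions on the actual output
    return accepted / len(generated)


def toy_score_fn(prefix: Sequence[int]) -> List[float]:
    """Toy 4-token model that always prefers (last token + 1) mod 4."""
    nxt = (prefix[-1] + 1) % 4 if prefix else 0
    return [1.0 if i == nxt else 0.0 for i in range(4)]


# A sequence the toy model fully agrees with, and one it partly rejects.
rate_consistent = introspective_acceptance_rate(toy_score_fn, [0], [1, 2, 3])
rate_inconsistent = introspective_acceptance_rate(toy_score_fn, [0], [1, 1, 3])
```

Under this assumed criterion, a model that decodes greedily left to right and is re-scored the same way accepts all of its own tokens, which is the sense in which AR models "agree with their own generations". A parallel decoder that fills several positions at once has no such guarantee, so its rate can fall below 1.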
