Anchored Diffusion Language Model

Litu Rout; Constantine Caramanis; Sanjay Shakkottai

arXiv:2505.18456 · cs.CL · May 27, 2025

TL;DR

The paper introduces Anchored Diffusion Language Models (ADLMs), which improve text generation quality and likelihood modeling by anchoring important tokens, achieving state-of-the-art results and surpassing autoregressive models in human-like text generation.

Contribution

The paper proposes a novel two-stage anchored diffusion framework that enhances language modeling and generation, with theoretical analysis and broad applicability beyond diffusion models.

Findings

01. Test perplexity improves significantly on the LM1B and OpenWebText datasets, by up to 25.4% over prior diffusion language models.

02. State-of-the-art zero-shot performance across seven benchmarks.

03. First diffusion model to outperform autoregressive models in human-like text generation.

Abstract

Diffusion Language Models (DLMs) promise parallel generation and bidirectional context, yet they underperform autoregressive (AR) models in both likelihood modeling and generated text quality. We identify that this performance gap arises when important tokens (e.g., key words or low-frequency words that anchor a sentence) are masked early in the forward process, limiting contextual information for accurate reconstruction. To address this, we introduce the Anchored Diffusion Language Model (ADLM), a novel two-stage framework that first predicts distributions over important tokens via an anchor network, and then predicts the likelihoods of missing tokens conditioned on the anchored predictions. ADLM significantly improves test perplexity on LM1B and OpenWebText, achieving up to 25.4% gains over prior DLMs, and narrows the gap with strong AR baselines. It also achieves state-of-the-art…
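
The two-stage idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the random linear layers standing in for the anchor network and the denoiser, the names `anchor_network`, `denoiser_network`, `mask_tokens`, and `masked_nll`, the vocabulary size, and the masking rate are all assumptions made for illustration. The abstract also does not say how important tokens are selected, so the sketch simply predicts a distribution at every position.

```python
import numpy as np

VOCAB_SIZE = 12   # hypothetical toy vocabulary
MASK_ID = 0       # hypothetical id reserved for the mask token
HIDDEN = 8        # hypothetical embedding width


def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)   # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def mask_tokens(tokens, mask_prob, rng):
    """Forward-process stand-in: independently replace tokens with MASK_ID."""
    is_masked = rng.random(tokens.shape) < mask_prob
    return np.where(is_masked, MASK_ID, tokens), is_masked


def make_params(rng):
    """Random weights standing in for trained networks (assumption)."""
    return {
        "embed": rng.normal(size=(VOCAB_SIZE, HIDDEN)),
        "anchor_out": rng.normal(size=(HIDDEN, VOCAB_SIZE)),
        "denoise_out": rng.normal(size=(HIDDEN, VOCAB_SIZE)),
    }


def anchor_network(noisy_tokens, params):
    """Stage 1: predict a token distribution at every position."""
    h = params["embed"][noisy_tokens]                # (length, HIDDEN)
    context = h.mean(axis=0, keepdims=True)          # crude bidirectional context
    return softmax((h + context) @ params["anchor_out"])


def denoiser_network(anchor_probs, params):
    """Stage 2: token likelihoods conditioned on the anchored predictions."""
    soft_embed = anchor_probs @ params["embed"]      # expected embedding per position
    context = soft_embed.mean(axis=0, keepdims=True)
    return softmax((soft_embed + context) @ params["denoise_out"])


def masked_nll(probs, targets, is_masked):
    """Mean negative log-likelihood over masked positions only."""
    if not is_masked.any():
        return 0.0
    picked = probs[np.arange(len(targets)), targets]
    return float(-np.log(picked[is_masked]).mean())


rng = np.random.default_rng(0)
params = make_params(rng)
tokens = rng.integers(1, VOCAB_SIZE, size=10)        # ids 1..11, never MASK_ID
noisy, is_masked = mask_tokens(tokens, mask_prob=0.5, rng=rng)
anchor_probs = anchor_network(noisy, params)         # stage 1
token_probs = denoiser_network(anchor_probs, params) # stage 2
loss = masked_nll(token_probs, tokens, is_masked)
```

The point of the sketch is the data flow: the second stage never sees the masked sequence directly, only the distributions produced by the first stage, which is one simple way to read "conditioned on the anchored predictions". A real model would use trained transformer networks and the paper's own training objective.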

Videos

Anchored Diffusion Language Model · SlidesLive

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Diffusion