CoDAR: Continuous Diffusion Language Models are More Powerful Than You Think

Junzhe Shen; Jieru Zhao; Ziwei He; Zhouhan Lin

arXiv:2603.02547·cs.CL·March 4, 2026

Open Access

TL;DR

CoDAR is a two-stage framework for continuous diffusion language models. Diffusion stays in embedding space, and a context-aware autoregressive decoder replaces token rounding, which the paper identifies as a primary bottleneck for generation quality.

Contribution

The paper proposes CoDAR, which keeps diffusion entirely continuous in an embedding space and learns a context-conditional discretizer, improving generation quality over latent diffusion.

Findings

01. CoDAR substantially improves generation quality over latent diffusion.

02. It is competitive with strong discrete diffusion language models.

03. A simple decoder temperature controls the trade-off between fluency and diversity.
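To make the third finding concrete, here is a minimal sketch of temperature-scaled sampling probabilities in plain Python. It is a generic illustration, not the paper's decoder: the function names and the example logits are assumptions chosen for this note. A lower temperature concentrates probability on the top-scoring token (more fluent, less varied output), and a higher temperature flattens the distribution (more diverse output).

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing by a temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means a more spread-out distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical next-token logits over a four-token vocabulary.
logits = [2.0, 1.0, 0.5, -1.0]
sharp = softmax_with_temperature(logits, 0.5)  # low temperature
flat = softmax_with_temperature(logits, 2.0)   # high temperature
```

Comparing `entropy(sharp)` with `entropy(flat)` shows the effect: the low-temperature distribution has lower entropy and puts more mass on the first token.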

Abstract

We study why continuous diffusion language models (DLMs) have lagged behind discrete diffusion approaches despite their appealing continuous generative dynamics. Under a controlled token-recovery study, we identify token rounding, the final projection from denoised embeddings to tokens, as a primary bottleneck. Building on these insights, we propose CoDAR (Continuous Diffusion with Contextual AutoRegressive Decoder), a two-stage framework that keeps diffusion entirely continuous in an embedding space while learning a strong, context-conditional discretizer: an autoregressive Transformer decoder that cross-attends to the denoised embedding sequence and performs contextualized rounding to tokens. Experiments on LM1B and OpenWebText demonstrate that CoDAR substantially improves generation quality over latent diffusion and becomes competitive with strong discrete DLMs, while exposing a…
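For readers unfamiliar with token rounding, the sketch below shows one common form of it, assuming rounding means picking the nearest vocabulary embedding for each position independently. The abstract does not specify the exact rounding rule studied, so the squared-L2 distance, the function name `round_to_tokens`, and the toy embedding table are all assumptions made for illustration.

```python
def round_to_tokens(denoised, embedding_table):
    """Map each denoised vector to the index of its nearest embedding.

    Each position is rounded on its own, with no access to the
    neighbouring positions (assumed baseline, squared L2 distance).
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(embedding_table)),
            key=lambda i: sq_dist(vec, embedding_table[i]))
        for vec in denoised
    ]


# Hypothetical 3-token vocabulary with 2-dimensional embeddings.
embedding_table = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# Hypothetical output of the diffusion model: one vector per position.
denoised = [[0.9, 0.1], [0.1, 0.8], [0.05, -0.05]]

tokens = round_to_tokens(denoised, embedding_table)
print(tokens)  # [1, 2, 0]
```

Because every position is decided in isolation, a slightly off embedding can flip to the wrong token with nothing to correct it. CoDAR, as described in the abstract, replaces this step with an autoregressive Transformer decoder that cross-attends to the whole denoised sequence, so each token choice can depend on context.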

Taxonomy

Topics: Language and cultural evolution · Topic Modeling · Generative Adversarial Networks and Image Synthesis