CRoCoDiL: Continuous and Robust Conditioned Diffusion for Language

Roy Uziel; Omer Belhasin; Itay Levy; Akhiad Bercovich; Ran El-Yaniv; Ran Zilberstein; and Michael Elad

arXiv:2603.20210 · cs.CL · April 20, 2026

TL;DR

CRoCoDiL moves masked diffusion for language into a continuous sentence-level semantic space. It jointly trains an encoder-demasker architecture and proposes novel hybrid and multi-diffusion algorithms, improving both generation quality and sampling speed.

Contribution

The paper presents a unified fine-tuning approach for masked diffusion language models that grounds demasking in continuous latent representations, enabling faster, higher-quality text synthesis.

Findings

01. Achieves more than 10x faster sampling in unconditional generation.
02. Demonstrates superior generation quality in experiments with LLaDA.
03. Introduces two novel diffusion algorithms: ConThenDisc and ConWithinDisc.

Abstract

Masked Diffusion Models (MDMs) provide an efficient non-causal alternative to autoregressive generation but often struggle with token dependencies and semantic incoherence due to their reliance on discrete marginal distributions. We address these limitations by shifting the diffusion process into a continuous sentence-level semantic space. We propose CRoCoDiL (Continuous and Robust Conditioned Diffusion for Language), a unified fine-tuning approach that jointly trains an encoder-demasker architecture, grounding the MDM demasking in continuous latent representations. This leads to the formation of a novel autoencoder in which decoding is obtained by an MDM algorithm. Relying on the same framework, we introduce two unconditional text synthesis algorithms: Continuous-Then-Discrete (ConThenDisc), a hybrid-diffusion approach that first generates latent representations in continuous space and…
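
The idea of grounding MDM demasking in a continuous latent can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the authors' implementation: the names `toy_demasker`, `demask`, and `latent_then_demask_sketch` are hypothetical, the "demasker" is a random linear map rather than a trained model, the latent is a plain Gaussian draw standing in for a continuous generation stage, and the confidence-based unmasking schedule is a common MDM heuristic that the abstract does not specify.

```python
import numpy as np

MASK = -1  # hypothetical id for the mask token


def toy_demasker(tokens, z, W):
    """Stand-in for a latent-conditioned demasker (assumption).

    Returns per-position probabilities over the vocabulary. The logits
    here depend only on the latent z; a real model would also attend to
    the tokens that have already been revealed.
    """
    seq_len = tokens.shape[0]
    logits = (z @ W).reshape(seq_len, -1)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


def demask(z, W, seq_len, steps):
    """Iteratively reveal tokens, most confident positions first."""
    tokens = np.full(seq_len, MASK, dtype=np.int64)
    per_step = -(-seq_len // steps)  # ceiling division
    while (tokens == MASK).any():
        probs = toy_demasker(tokens, z, W)
        conf = probs.max(axis=1)
        conf[tokens != MASK] = -np.inf  # never re-pick revealed positions
        k = min(per_step, int((tokens == MASK).sum()))
        pick = np.argsort(-conf)[:k]
        tokens[pick] = probs[pick].argmax(axis=1)
    return tokens


def latent_then_demask_sketch(seq_len=8, vocab=16, latent_dim=4, steps=4, seed=0):
    rng = np.random.default_rng(seed)
    # Random weights in place of a trained demasker (assumption).
    W = rng.normal(size=(latent_dim, seq_len * vocab))
    # Placeholder for the continuous stage: a single Gaussian draw,
    # not an actual continuous diffusion sampler.
    z = rng.normal(size=latent_dim)
    # Discrete stage: demasking conditioned on the latent z.
    return demask(z, W, seq_len, steps)


if __name__ == "__main__":
    print(latent_then_demask_sketch())
```

The sketch only shows the control flow: a sentence-level latent is produced first, and every demasking step is then conditioned on it, so all positions share one source of semantic information instead of being predicted from independent marginals alone.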
