The Diffusion Duality

Subham Sekhar Sahoo; Justin Deschenaux; Aaron Gokaslan; Guanghan Wang; Justin Chiu; Volodymyr Kuleshov

arXiv:2506.10892·cs.LG·December 22, 2025

Open Access · 1 Repo · 2 Models

TL;DR

This paper introduces Duo, a uniform-state discrete diffusion language model that transfers techniques from Gaussian diffusion to speed up training and sampling. With curriculum learning, it surpasses autoregressive models in zero-shot perplexity on 3 of 7 benchmarks.

Contribution

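The paper proposes Duo, a method that transfers Gaussian diffusion techniques to uniform-state discrete diffusion models. It contributes a curriculum learning strategy that doubles training speed by reducing variance, and Discrete Consistency Distillation, which enables fast sampling in language models.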

Findings

01

Models trained with curriculum learning surpass autoregressive models in zero-shot perplexity on 3 of 7 benchmarks.

02

Discrete Consistency Distillation accelerates sampling by two orders of magnitude.

03

Duo achieves competitive results on multiple benchmarks.

Abstract

Uniform-state discrete diffusion models hold the promise of fast text generation due to their inherent ability to self-correct. However, they are typically outperformed by autoregressive models and masked diffusion models. In this work, we narrow this performance gap by leveraging a key insight: Uniform-state diffusion processes naturally emerge from an underlying Gaussian diffusion. Our method, Duo, transfers powerful techniques from Gaussian diffusion to improve both training and sampling. First, we introduce a curriculum learning strategy guided by the Gaussian process, doubling training speed by reducing variance. Models trained with curriculum learning surpass autoregressive models in zero-shot perplexity on 3 of 7 benchmarks. Second, we present Discrete Consistency Distillation, which adapts consistency distillation from the continuous to the discrete setting. This algorithm…
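The abstract's key insight, that a uniform-state process emerges from an underlying Gaussian diffusion, can be checked numerically. The following is a minimal sketch, not the authors' implementation. It assumes the discrete state is obtained by taking the argmax of a Gaussian-noised one-hot vector and that the noise schedule is variance-preserving; the abstract excerpt above does not spell out either detail. The function names and parameters (`gaussian_to_discrete`, `empirical_marginal`, `alpha`) are hypothetical and chosen for illustration only.

```python
import random


def gaussian_to_discrete(token, vocab_size, alpha, rng):
    """Noise a one-hot token with Gaussian noise, then map back via argmax.

    Assumption: latent = alpha * one_hot + sigma * eps with
    sigma = sqrt(1 - alpha^2) (a variance-preserving schedule).
    """
    sigma = (1.0 - alpha ** 2) ** 0.5
    latent = [
        alpha * (1.0 if i == token else 0.0) + sigma * rng.gauss(0.0, 1.0)
        for i in range(vocab_size)
    ]
    # The index of the largest latent coordinate is the discrete state.
    return max(range(vocab_size), key=latent.__getitem__)


def empirical_marginal(token, vocab_size, alpha, n_samples, seed=0):
    """Estimate the distribution of the discrete state by Monte Carlo."""
    rng = random.Random(seed)
    counts = [0] * vocab_size
    for _ in range(n_samples):
        counts[gaussian_to_discrete(token, vocab_size, alpha, rng)] += 1
    return [c / n_samples for c in counts]


# Clean token 2 in a 5-token vocabulary, at a moderate noise level.
probs = empirical_marginal(token=2, vocab_size=5, alpha=0.8, n_samples=20000)
print(probs)
```

By symmetry, every token other than the clean one should receive roughly the same probability mass, which is the defining shape of a uniform-state marginal: extra mass on the clean token and a uniform spread over the rest. Lowering `alpha` toward 0 moves the estimate toward a fully uniform distribution, while `alpha = 1` leaves the token unchanged.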

Code & Models

Repositories

s-sahoo/duo
PyTorch · Official

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Topic Modeling

Methods: Diffusion · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings