Continuous Diffusion Scales Competitively with Discrete Diffusion for Language
Zhihan Yang, Wei Guo, Shuibai Zhang, Subham Sekhar Sahoo, Yongxin Chen, Arash Vahdat, Morteza Mardani, John Thickstun

TL;DR
This paper demonstrates that likelihood-trained continuous diffusion language models can be scaled competitively with discrete models, achieving state-of-the-art results and providing theoretical insights into their advantages.
Contribution
RePlaid, a likelihood-based continuous diffusion language model, is constructed to rival discrete models in scalability and performance, challenging previous beliefs about continuous diffusion limitations.
Findings
RePlaid achieves a compute gap of only 20x compared to autoregressive models.
RePlaid outperforms Duo while using fewer parameters.
RePlaid sets a new state-of-the-art perplexity (PPL) bound of 22.1 among continuous DLMs on OpenWebText.
Abstract
While diffusion has drawn considerable recent attention from the language modeling community, continuous diffusion has appeared less scalable than discrete approaches. To challenge this belief, we revisit Plaid, a likelihood-based continuous diffusion language model (DLM), and construct RePlaid by aligning the architecture of Plaid with modern discrete DLMs. In this unified setting, we establish the first scaling law for continuous DLMs that rivals discrete DLMs: RePlaid exhibits a compute gap of only 20x compared to autoregressive models, outperforms Duo while using fewer parameters, and outperforms MDLM in the over-trained regime. We benchmark RePlaid against recent continuous DLMs: on OpenWebText, RePlaid achieves a new state-of-the-art PPL bound of 22.1 among continuous DLMs and superior generation quality. These results suggest that continuous diffusion, when trained via…
