Self-conditioned Embedding Diffusion for Text Generation
Robin Strudel, Corentin Tallec, Florent Altché, Yilun Du, Yaroslav Ganin, Arthur Mensch, Will Grathwohl, Nikolay Savinov, Sander Dieleman, Laurent Sifre, Rémi Leblond

TL;DR
This paper introduces Self-conditioned Embedding Diffusion, a continuous diffusion model that operates on token embeddings. It generates text samples comparable to those of standard autoregressive language models and is, in theory, more efficient on accelerator hardware at inference time.
Contribution
It presents a continuous diffusion mechanism for text that operates on token embeddings and supports flexible, scalable models for both conditional and unconditional text generation.
Findings
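In qualitative and quantitative evaluation, generated samples are comparable to those produced by standard autoregressive language models.
The diffusion models are, in theory, more efficient on accelerator hardware at inference time.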
The approach paves the way for scaling diffusion models for text.
Abstract
Can continuous diffusion models bring the same performance breakthrough on natural language they did for image generation? To circumvent the discrete nature of text data, we can simply project tokens in a continuous space of embeddings, as is standard in language modeling. We propose Self-conditioned Embedding Diffusion, a continuous diffusion mechanism that operates on token embeddings and allows to learn flexible and scalable diffusion models for both conditional and unconditional text generation. Through qualitative and quantitative evaluation, we show that our text diffusion models generate samples comparable with those produced by standard autoregressive language models - while being in theory more efficient on accelerator hardware at inference time. Our work paves the way for scaling up diffusion models for text, similarly to autoregressive models, and for improving performance…
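The core idea in the abstract, projecting discrete tokens into a continuous embedding space and running diffusion there, can be illustrated with a minimal sketch. This is not the authors' implementation: the random embedding table, the variance-preserving noising formula, the `alpha_bar` parameter, and the nearest-neighbour rounding step are all assumptions chosen for illustration, and the denoising network and the self-conditioning step are omitted.

```python
import numpy as np

def embed_tokens(token_ids, embedding_table):
    """Map discrete token ids to continuous vectors (one row per token)."""
    return embedding_table[token_ids]

def add_noise(x0, alpha_bar, rng):
    """Forward diffusion step under an assumed variance-preserving schedule:
    x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps, eps ~ N(0, I)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

def nearest_tokens(x, embedding_table):
    """Round continuous vectors back to token ids by nearest embedding
    (squared Euclidean distance). Hypothetical decoding step."""
    diffs = x[:, None, :] - embedding_table[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=-1)
    return sq_dists.argmin(axis=1)

rng = np.random.default_rng(0)
vocab_size, dim = 50, 16                      # toy sizes, assumed
embedding_table = rng.standard_normal((vocab_size, dim))

token_ids = np.array([3, 17, 42, 8])          # a toy "sentence"
x0 = embed_tokens(token_ids, embedding_table) # clean embeddings
x_t = add_noise(x0, alpha_bar=0.99, rng=rng)  # lightly noised embeddings
recovered = nearest_tokens(x_t, embedding_table)
```

In a full model, a neural network would be trained to predict `x0` from `x_t`, and sampling would start from pure noise and denoise iteratively before the final rounding step. Because every position in the sequence is updated at once, this is where the abstract's claim of better efficiency on accelerator hardware would come from.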
Taxonomy
Topics: Topic Modeling
Methods: Diffusion
