TL;DR
ELF is a diffusion language model that operates in continuous embedding space and maps to discrete tokens only at the final time step. It is reported to reach higher generation quality in fewer sampling steps than existing DLMs.
Contribution
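The paper presents Embedded Language Flows (ELF), a class of diffusion models built on continuous-time Flow Matching that operate in continuous embedding space. Because generation stays continuous until the final time step, techniques from image-domain diffusion models, such as classifier-free guidance, carry over with little modification.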
Findings
ELF outperforms existing discrete and continuous DLMs in generation quality.
ELF needs fewer sampling steps to reach high-quality generations.
ELF's continuous formulation makes it straightforward to adapt image-domain diffusion techniques to language.
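As an illustration of the kind of image-domain technique mentioned above, here is a minimal sketch of classifier-free guidance applied to velocity predictions. It uses the standard guidance formula from image diffusion; this summary does not describe how ELF applies guidance, so the function name `guided_velocity`, the `guidance_scale` parameter, and the toy arrays are all assumptions made for illustration.

```python
import numpy as np

def guided_velocity(v_cond, v_uncond, guidance_scale):
    """Standard classifier-free guidance mix (assumed form, not taken from the paper).

    guidance_scale = 0 returns the unconditional prediction,
    guidance_scale = 1 returns the conditional prediction,
    guidance_scale > 1 extrapolates past the conditional prediction.
    """
    return v_uncond + guidance_scale * (v_cond - v_uncond)

# Toy velocity predictions standing in for two forward passes of a model.
v_cond = np.array([1.0, 2.0])
v_uncond = np.array([0.0, 1.0])

v_guided = guided_velocity(v_cond, v_uncond, guidance_scale=2.0)
```

Because the mix is a plain linear combination of two continuous predictions, it needs no special handling when the model's state is an embedding rather than an image.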
Abstract
Diffusion and flow-based models have become the de facto approaches for generating continuous data, e.g., in domains such as images and videos. Their success has attracted growing interest in applying them to language modeling. Unlike their image-domain counterparts, today's leading diffusion language models (DLMs) primarily operate over discrete tokens. In this paper, we show that continuous DLMs can be made effective with minimal adaptation to the discrete domain. We propose Embedded Language Flows (ELF), a class of diffusion models in continuous embedding space based on continuous-time Flow Matching. Unlike existing DLMs, ELF predominantly stays within the continuous embedding space until the final time step, where it maps to discrete tokens using a shared-weight network. This formulation makes it straightforward to adapt established techniques from image-domain diffusion models,…
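The pipeline the abstract outlines, integrating a flow in embedding space and converting to tokens only at the end, can be sketched roughly as follows. This is a toy under stated assumptions, not the paper's implementation: `toy_velocity` is an oracle stand-in for a trained network, the linear noise-to-data path is one common Flow Matching convention, and `decode_nearest` uses a nearest-embedding lookup in place of the shared-weight network the abstract mentions. All names and sizes are hypothetical.

```python
import numpy as np

def euler_sample(velocity_fn, x0, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        x = x + dt * velocity_fn(x, t)
    return x

def decode_nearest(x, embedding_table):
    """Map each continuous vector to the id of its nearest embedding row.

    Assumption: a stand-in for the final continuous-to-discrete mapping.
    """
    # Squared distances between every position and every vocabulary embedding,
    # shape (seq_len, vocab_size).
    d2 = ((x[:, None, :] - embedding_table[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

rng = np.random.default_rng(0)
vocab_size, dim, seq_len = 50, 16, 8          # hypothetical sizes
embedding_table = rng.normal(size=(vocab_size, dim))
target_ids = rng.integers(0, vocab_size, size=seq_len)
x1 = embedding_table[target_ids]              # "clean" embeddings of a toy sentence

def toy_velocity(x, t):
    """Oracle velocity that points at x1 along a linear path (replaces a trained model)."""
    return (x1 - x) / (1.0 - t)

x0 = rng.normal(size=(seq_len, dim))          # start from Gaussian noise
x_final = euler_sample(toy_velocity, x0, num_steps=10)
decoded_ids = decode_nearest(x_final, embedding_table)
```

The sampler never touches token ids; discreteness enters only in the last call, which mirrors the "continuous until the final time step" structure described in the abstract.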
