Towards Latent Diffusion Suitable For Text

Nesta Midavaine; Christian A. Naesseth; Grigory Bartosh

arXiv:2601.16220 · cs.CL · January 26, 2026

TL;DR

This paper introduces Neural Flow Diffusion Models (NFDM) for language generation, an extension of NFDM that applies continuous diffusion models to discrete language data. The model substantially reduces the likelihood gap with autoregressive models of the same size, while achieving sample quality comparable to that of previous latent diffusion models.

Contribution

The paper extends NFDM so that continuous diffusion models can be applied directly to discrete state spaces, with a multivariate forward process learned from the data rather than fixed in advance.

Findings

01. Reduces likelihood gap with autoregressive models
02. Achieves sample quality comparable to previous latent diffusion models
03. Substantially improves sampling efficiency

Abstract

Language diffusion models aim to improve sampling speed and coherence over autoregressive LLMs. We introduce Neural Flow Diffusion Models for language generation, an extension of NFDM that enables the straightforward application of continuous diffusion models to discrete state spaces. NFDM learns a multivariate forward process from the data, ensuring that the forward process and generative trajectory are a good fit for language modeling. Our model substantially reduces the likelihood gap with autoregressive models of the same size, while achieving sample quality comparable to that of previous latent diffusion models.
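
To make the phrase "continuous diffusion on discrete state spaces with a learned forward process" concrete, here is a minimal sketch, not the authors' implementation. It assumes tokens are mapped to continuous embeddings and noised with a Gaussian forward process whose schedule has one learnable rate per embedding dimension. The abstract does not specify the parameterization, so the names `embedding`, `log_rate`, `forward_sample`, and `decode`, the cosine schedule, and the toy sizes are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB_SIZE, EMBED_DIM = 8, 4  # toy sizes, chosen only for illustration

# Hypothetical token embedding table: maps discrete tokens to continuous vectors.
embedding = rng.normal(size=(VOCAB_SIZE, EMBED_DIM))

# Hypothetical learnable parameters: one noise rate per embedding dimension.
# In a trained model these would be optimized; here they are random.
log_rate = rng.normal(scale=0.1, size=EMBED_DIM)


def forward_sample(tokens, t, eps):
    """Sample z_t = alpha(t) * x + sigma(t) * eps with per-dimension schedules."""
    x = embedding[tokens]                     # (seq_len, EMBED_DIM)
    rate = np.exp(log_rate)                   # positive, one value per dimension
    alpha = np.cos(0.5 * np.pi * t) ** rate   # signal scale: 1 at t=0, ~0 at t=1
    sigma = np.sqrt(1.0 - alpha ** 2)         # noise scale (variance preserving)
    return alpha * x + sigma * eps


def decode(z):
    """Map continuous latents back to tokens by nearest embedding."""
    dists = ((z[:, None, :] - embedding[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)


tokens = np.array([1, 3, 5])
eps = rng.normal(size=(len(tokens), EMBED_DIM))
z_clean = forward_sample(tokens, 0.0, eps)   # equals the token embeddings
z_noise = forward_sample(tokens, 1.0, eps)   # approximately pure noise
recovered = decode(z_clean)                  # recovers the original tokens
```

At `t = 0` the latent equals the token embedding and nearest-embedding decoding recovers the tokens; at `t = 1` it is approximately pure noise. A data-dependent forward process, as the abstract describes, would replace the fixed cosine form with functions learned jointly with the generative model.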

Taxonomy

Topics: Topic Modeling · Language and cultural evolution · Generative Adversarial Networks and Image Synthesis