On the Role of Discreteness in Diffusion LLMs
Ziqi Jin, Bin Wang, Xiang Lin, Lidong Bing, Aixin Sun

TL;DR
This paper analyzes why the discrete, highly structured nature of text makes it hard to apply diffusion models directly to language generation, and offers insights toward more coherent diffusion language models.
Contribution
The paper categorizes existing diffusion approaches for language, identifies their limitations, and suggests directions for developing more structure-aware diffusion models.
Findings
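Uniform corruption does not respect how information is distributed across positions.
Token-wise training misses multi-token dependencies.
Each existing approach satisfies only part of the five essential properties and therefore reflects a structural trade-off.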
Abstract
Diffusion models offer appealing properties for language generation, such as parallel decoding and iterative refinement, but the discrete and highly structured nature of text challenges the direct application of diffusion principles. In this paper, we revisit diffusion language modeling from the view of diffusion process and language modeling, and outline five properties that separate diffusion mechanics from language-specific requirements. We first categorize existing approaches into continuous diffusion in embedding space and discrete diffusion over tokens. We then show that each satisfies only part of the five essential properties and therefore reflects a structural trade-off. Through analyses of recent large diffusion language models, we identify two central issues: (i) uniform corruption does not respect how information is distributed across positions, and (ii) token-wise marginal…
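To make the phrase "uniform corruption" concrete, here is a minimal sketch of a masking-style forward process of the kind commonly used in discrete diffusion over tokens. It is an illustration under stated assumptions, not the paper's method: the `MASK` symbol, the `uniform_mask` helper, and the corruption level `t` are hypothetical names chosen for this example. The point it shows is that every position is masked with the same probability, regardless of how much information that position carries.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, chosen for illustration


def uniform_mask(tokens, t, rng):
    """Mask each token independently with the same probability t.

    The probability does not depend on the position or on the token
    itself, which is what "uniform" means in this sketch.
    """
    return [MASK if rng.random() < t else tok for tok in tokens]


rng = random.Random(0)  # fixed seed so repeated runs behave identically
sentence = "the cat sat on the mat".split()

# t = 0.0 leaves the sequence intact, t = 1.0 masks everything,
# and intermediate values mask a random subset of positions.
for t in (0.0, 0.5, 1.0):
    print(t, uniform_mask(sentence, t, rng))
```

In this sketch a function word such as "the" and a content word such as "cat" are equally likely to be masked, which is one way to read the abstract's remark that uniform corruption does not respect how information is distributed across positions.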
Taxonomy
Topics: Language and cultural evolution · Authorship Attribution and Profiling · Natural Language Processing Techniques
