A Cheaper and Better Diffusion Language Model with Soft-Masked Noise
Jiaao Chen, Aston Zhang, Mu Li, Alex Smola, Diyi Yang

TL;DR
This paper introduces Masked-Diffuse LM, a novel diffusion model for language that uses soft-masking and categorical prediction to improve efficiency and generation quality over existing diffusion models for text.
Contribution
The paper proposes a linguistically informed diffusion process that combines soft-masking with categorical prediction, addressing the limitations of Gaussian noise when modeling discrete language data.
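To make the soft-masking idea concrete, here is a minimal sketch of one forward (noising) step in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function name `soft_mask`, the per-token `importance` scores, and the linear schedule are all hypothetical, as is the choice to corrupt more important tokens sooner.

```python
def soft_mask(embeddings, mask_embedding, importance, t, num_steps):
    """Blend each token embedding toward a shared mask embedding.

    Assumptions (not taken from the source): the blend weight grows
    linearly with the step t, and tokens with a higher importance
    score reach the fully masked state earlier.
    """
    progress = t / num_steps  # 0.0 = clean text, 1.0 = fully noised
    noised = []
    for emb, imp in zip(embeddings, importance):
        # Hypothetical schedule: scale progress by (1 + importance), cap at 1.
        weight = min(1.0, progress * (1.0 + imp))
        # Soft mask: convex combination of the token and mask embeddings.
        noised.append([(1.0 - weight) * e + weight * m
                       for e, m in zip(emb, mask_embedding)])
    return noised


# Two toy 2-dimensional token embeddings; the first token is "important".
tokens = [[1.0, 0.0], [0.0, 1.0]]
mask = [0.0, 0.0]
halfway = soft_mask(tokens, mask, importance=[1.0, 0.0], t=5, num_steps=10)
```

Unlike adding Gaussian noise, this corruption keeps every intermediate state on the line between a real token embedding and the mask embedding, which is one way to read "soft-masking" as described above.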
Findings
Outperforms state-of-the-art diffusion models in text generation quality
Achieves lower training costs compared to existing diffusion-based language models
Demonstrates effectiveness across five controlled generation tasks
Abstract
Diffusion models based on iterative denoising have recently been proposed and applied to various generation tasks such as image generation. However, because they are inherently built for continuous data, existing diffusion models still have limitations in modeling discrete data such as language. For example, the commonly used Gaussian noise cannot handle discrete corruption well, and objectives in continuous spaces are unstable for textual data in the diffusion process, especially when the dimension is high. To alleviate these issues, we introduce a novel diffusion model for language modeling, Masked-Diffuse LM, with lower training cost and better performance, inspired by linguistic features of languages. Specifically, we design a linguistically informed forward process that adds corruption to the text through strategic soft-masking to better noise the textual…
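The abstract notes that objectives in continuous spaces are unstable for text, and the contribution summary names categorical prediction as the alternative. As a rough illustration only, the sketch below scores a denoised hidden state against a toy vocabulary and computes a cross-entropy loss over token identities. The helper names `vocab_logits` and `cross_entropy_from_logits`, and the use of dot-product scoring, are assumptions made for this example rather than details confirmed by the source.

```python
import math


def vocab_logits(hidden, vocab_embeddings):
    """Score each vocabulary entry by its dot product with the hidden state."""
    return [sum(h * v for h, v in zip(hidden, vec)) for vec in vocab_embeddings]


def cross_entropy_from_logits(logits, target_index):
    """Negative log-likelihood of the target token under softmax(logits)."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(
        sum(math.exp(x - max_logit) for x in logits)
    )
    return log_sum_exp - logits[target_index]


# Toy vocabulary of two tokens with 2-dimensional embeddings (hypothetical).
vocab = [[1.0, 0.0], [0.0, 1.0]]
hidden_state = [1.0, 0.0]  # denoised state that lies closest to token 0
loss_correct = cross_entropy_from_logits(vocab_logits(hidden_state, vocab), 0)
loss_wrong = cross_entropy_from_logits(vocab_logits(hidden_state, vocab), 1)
```

The loss is computed over discrete token identities rather than as a distance between embedding vectors, so it stays a classification problem however large the embedding dimension is.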
Taxonomy
Topics: Topic Modeling · Music and Audio Processing · Natural Language Processing Techniques
Methods: Diffusion
