FastDiSS: Few-step Match Many-step Diffusion Language Model on Sequence-to-Sequence Generation--Full Version
Dat Nguyen-Cong, Tung Kieu, Hoang Thanh-Tung

TL;DR
FastDiSS introduces a training framework for diffusion language models that improves robustness to self-conditioning errors in few-step sampling and enables up to 400x faster inference in sequence-to-sequence tasks.
Contribution
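It proposes a training method that perturbs the self-conditioning signal to match inference-time noise, mitigating self-conditioning errors, and adds a token-level noise-awareness mechanism that keeps training from saturating, improving both the speed and the quality of diffusion models.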
Findings
Surpasses standard diffusion models in quality.
Achieves up to 400x faster inference.
Remains competitive with one-step diffusion frameworks.
Abstract
Self-conditioning has been central to the success of continuous diffusion language models, as it allows models to correct previous errors. Yet its ability degrades precisely in the regime where diffusion is most attractive for deployment: few-step sampling for fast inference. In this study, we show that when models have only a few denoising steps, inaccurate self-conditioning induces a substantial approximation gap; this error compounds across denoising steps and ultimately dominates sample quality. To address this, we propose a novel training framework that handles these errors during learning by perturbing the self-conditioning signal to match inference noise, improving robustness to prior estimation errors. In addition, we introduce a token-level noise-awareness mechanism that prevents training from saturating, hence improving optimization. Extensive experiments across…
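The idea of perturbing the self-conditioning signal during training can be illustrated with a small NumPy sketch. This is not the authors' implementation: the Gaussian perturbation rule, the `sc_noise_scale` parameter, and the `toy_denoiser` stand-in for a network are all assumptions made for illustration. Only the general pattern (estimate the clean sequence once, corrupt that estimate, then feed it back as the self-conditioning input) follows the description above.

```python
import numpy as np

def perturb_self_condition(x0_prev, noise_scale, rng):
    """Add Gaussian noise to a previous clean-data estimate.

    Hypothetical perturbation rule; the paper's exact rule is not given here.
    """
    return x0_prev + noise_scale * rng.standard_normal(x0_prev.shape)

def toy_denoiser(x_t, self_cond, weight=0.5):
    """Stand-in for a network: blends the noisy input with the self-conditioning signal."""
    return (1.0 - weight) * x_t + weight * self_cond

def training_step(x0, alpha_bar, sc_noise_scale, rng):
    """One illustrative training step with a perturbed self-conditioning signal."""
    # Forward diffusion: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

    # First pass with an empty (all-zero) self-conditioning input.
    x0_first = toy_denoiser(x_t, np.zeros_like(x_t))

    # Corrupt the first estimate so the second pass sees an imperfect signal,
    # mimicking the inaccurate estimates available in few-step sampling.
    self_cond = perturb_self_condition(x0_first, sc_noise_scale, rng)

    # Second pass conditioned on the perturbed estimate; mean squared error loss.
    x0_pred = toy_denoiser(x_t, self_cond)
    loss = float(np.mean((x0_pred - x0) ** 2))
    return loss, self_cond, x0_first

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 8))  # 4 tokens, 8-dimensional embeddings (assumed sizes)
loss, self_cond, x0_first = training_step(x0, alpha_bar=0.5, sc_noise_scale=0.1, rng=rng)
print(f"loss with perturbed self-conditioning: {loss:.4f}")
```

Setting `sc_noise_scale` to zero recovers ordinary self-conditioning in this sketch, which makes it easy to compare the two settings on the same input.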
