S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation
Ligong Han, Hao Wang, Han Gao, Kai Xu, Akash Srivastava

TL;DR
S2D2 is a training-free self-speculative decoding method for block-diffusion language models. It pairs parallel diffusion drafting with autoregressive verification by the same pretrained model, improving speed and accuracy without extra training or test-time compute.
Contribution
It proposes a novel hybrid decoding framework that uses the pretrained model as both drafter and verifier, improving speed and accuracy in block-diffusion models without additional training.
Findings
Up to 4.7× speedup over autoregressive decoding on SDAR
Up to 1.57× speedup over tuned dynamic decoding baseline
Accuracy improved by up to 4.5 points
Abstract
Block-diffusion language models offer a promising path toward faster-than-autoregressive generation by combining block-wise autoregressive decoding with within-block parallel denoising. However, in the few-step regime needed for practical acceleration, standard confidence-thresholded decoding is often brittle: aggressive thresholds hurt quality, while conservative thresholds require unnecessary denoising steps. Existing approaches that address this issue either require additional training or incur extra test-time compute. We present S2D2, a training-free self-speculative decoding framework for block-diffusion language models. Our key observation is that a block-diffusion model becomes autoregressive when the block size is reduced to one, allowing the same pretrained model to act as both drafter and verifier. S2D2 inserts a speculative verification step into standard block-diffusion…
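To make the two ingredients named in the abstract concrete, here is a minimal Python sketch of (a) one confidence-thresholded unmasking step inside a block and (b) a greedy prefix check that a block-size-one (autoregressive) pass could apply to drafted tokens. It is an illustration under stated assumptions, not the paper's algorithm: the function names `threshold_unmask` and `accept_prefix`, the fallback that always commits one token, and the greedy exact-match acceptance rule are all hypothetical, since the abstract as shown is truncated before it describes the actual verification procedure.

```python
import numpy as np

def threshold_unmask(probs, masked, tau):
    """One confidence-thresholded denoising step for a single block.

    probs  : (block_len, vocab) array of per-position token distributions
             (stand-in for a model's output; hypothetical interface).
    masked : boolean sequence, True where the position is still masked.
    tau    : confidence threshold in [0, 1].
    Returns {position: token} for the positions committed in this step.
    """
    probs = np.asarray(probs)
    masked = np.asarray(masked, dtype=bool)
    conf = probs.max(axis=1)         # confidence = top-1 probability
    tokens = probs.argmax(axis=1)    # greedy token per position
    idx = np.flatnonzero(masked)
    if idx.size == 0:
        return {}
    chosen = [i for i in idx if conf[i] >= tau]
    if not chosen:
        # Assumption: commit the single most confident masked position
        # so that every step makes progress.
        chosen = [idx[np.argmax(conf[idx])]]
    return {int(i): int(tokens[i]) for i in chosen}

def accept_prefix(draft_tokens, verifier_probs):
    """Length of the longest draft prefix that matches the verifier's
    greedy choices (assumed acceptance rule, for illustration only)."""
    verifier_tokens = np.asarray(verifier_probs).argmax(axis=1)
    n = 0
    for d, v in zip(draft_tokens, verifier_tokens):
        if d != v:
            break
        n += 1
    return n

# Toy distributions over a 2-token vocabulary for a block of length 3.
toy_probs = np.array([[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]])
committed = threshold_unmask(toy_probs, [True, True, True], tau=0.75)
accepted = accept_prefix([0, 1, 1], toy_probs)
```

The sketch shows the trade-off the abstract describes: a low `tau` commits many positions per step at the risk of errors, while a high `tau` commits few and needs more steps. A verification pass gives a way to draft aggressively and then keep only the prefix the verifier agrees with.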
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning · Speech Recognition and Synthesis
