TL;DR
The paper introduces Speculative Coupled Decoding (SCD), a training-free method that accelerates autoregressive visual generation without quality loss by stabilizing draft token sampling.
Contribution
It proposes a training-free, lossless decoding framework that speeds up AR generation by coupling draft tokens, requiring only minimal changes to the decoding algorithm.
Findings
Achieves up to 4.2x speedup in image generation
Achieves up to 13.6x speedup in video generation
Maintains generation quality without additional training
Abstract
Autoregressive (AR) modeling has recently emerged as a promising new paradigm in visual generation, but its practical adoption is severely constrained by the slow inference speed of per-token generation, which often requires thousands of steps to produce a single sample. While several Speculative Decoding (SD)-based methods have been proposed to solve this problem by generating multiple tokens in a single forward step, they suffer from limited speedup, degraded quality, or require the training of a draft model. To solve these problems, we propose a new training-free, lossless SD framework, Speculative Coupled Decoding (SCD), by extending the recently proposed Speculative Jacobi Decoding (SJD). While SJD shows strong potential for accelerating AR generation by combining Jacobi iteration and SD, we found that its acceptance rate is still significantly limited due to the instability…
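For readers unfamiliar with why SD-based methods can be lossless, the following is a minimal sketch of the standard speculative-sampling accept/reject step that such methods build on. It is not the paper's SCD procedure and does not implement the coupling of draft tokens. The function name `speculative_accept` and the toy distributions are hypothetical, chosen only for illustration.

```python
import numpy as np

def speculative_accept(p, q, draft_token, rng):
    """One accept/reject step of standard speculative sampling (generic SD, not SCD).

    p: target-model distribution over the vocabulary (1-D array summing to 1)
    q: draft distribution the token was sampled from (same shape as p)
    draft_token: index sampled from q
    Returns (token, accepted).
    """
    # Accept the draft token with probability min(1, p/q).
    accept_prob = min(1.0, p[draft_token] / q[draft_token])
    if rng.random() < accept_prob:
        return draft_token, True
    # On rejection, resample from the normalized residual max(p - q, 0).
    # This correction is what keeps the output distributed exactly as p.
    residual = np.maximum(p - q, 0.0)
    residual = residual / residual.sum()
    return int(rng.choice(len(p), p=residual)), False

# Toy example with assumed distributions over a 3-token vocabulary.
rng = np.random.default_rng(0)
p = np.array([0.5, 0.3, 0.2])   # target distribution (assumed)
q = np.array([0.2, 0.5, 0.3])   # draft distribution (assumed)
draft = int(rng.choice(len(q), p=q))
token, accepted = speculative_accept(p, q, draft, rng)
```

The closer the draft distribution is to the target distribution, the more often draft tokens are accepted, which is why the acceptance rate drives the achievable speedup.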
