Parallel Sampling via Autospeculation
Nima Anari, Carlo Baronio, CJ Chen, Alireza Haqi, Frederic Koehler, Anqi Li, Thuy-Duong Vuong

TL;DR
This paper introduces parallel algorithms for sampling from any-order autoregressive models and denoising diffusion models. By issuing oracle calls in parallel, a novel autospeculation technique reduces the expected sampling time from roughly linear in the sequence length n to roughly n^{1/2}.
Contribution
It improves the previous parallel sampling bound for any-order autoregressive models and gives the first parallel speedup for diffusion models in the high-accuracy regime, using a new autospeculation method inspired by speculative decoding.
Findings
Expected sampling time reduced from roughly n to roughly n^{1/2}, where n is the sequence length
First parallel speedup for diffusion models in high-accuracy regime
Introduces autospeculation technique for accelerated sampling
Abstract
We present parallel algorithms to accelerate sampling via counting in two settings: any-order autoregressive models and denoising diffusion models. An any-order autoregressive model accesses a target distribution μ on [q]^n through an oracle that provides conditional marginals, while a denoising diffusion model accesses a target distribution μ on R^n through an oracle that provides conditional means under Gaussian noise. Standard sequential sampling algorithms require Õ(n) time to produce a sample from μ in either setting. We show that, by issuing oracle calls in parallel, the expected sampling time can be reduced to Õ(n^{1/2}). This improves the previous bound for any-order autoregressive models and yields the first parallel speedup for diffusion models in the high-accuracy regime, under the relatively mild…
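To make the two oracle models in the abstract concrete, here is a minimal Python sketch of the sequential baseline, not the paper's parallel algorithm. Everything in it is an illustrative assumption: the toy distribution is a brute-force table over [q]^n, the function names (make_toy_distribution, conditional_marginal, sequential_sample, posterior_mean) are hypothetical, and the denoising oracle assumes the simple observation model y = x + sigma * z with standard Gaussian z, which may differ from the parametrization used in the paper.

```python
import itertools
import math
import random


def make_toy_distribution(n, q, seed=0):
    """Hypothetical target: a random strictly positive table over [q]^n."""
    rng = random.Random(seed)
    states = list(itertools.product(range(q), repeat=n))
    weights = [rng.random() + 0.01 for _ in states]
    total = sum(weights)
    return {s: w / total for s, w in zip(states, weights)}


def conditional_marginal(mu, pinned, i, q):
    """Oracle sketch: law of coordinate i given the pinned coordinates.

    Computed by brute force here; a real model would answer this query
    with a single network evaluation.
    """
    probs = [0.0] * q
    for state, p in mu.items():
        if all(state[j] == v for j, v in pinned.items()):
            probs[state[i]] += p
    total = sum(probs)
    return [p / total for p in probs]


def sequential_sample(mu, n, q, order, rng):
    """Sequential baseline: one oracle call per round, n rounds in total."""
    pinned = {}
    rounds = 0
    for i in order:
        probs = conditional_marginal(mu, pinned, i, q)
        pinned[i] = rng.choices(range(q), weights=probs)[0]
        rounds += 1
    return tuple(pinned[i] for i in range(n)), rounds


def posterior_mean(mu, y, sigma):
    """Oracle sketch: E[x | y] under the assumed model y = x + sigma * z."""
    log_w = {
        x: math.log(p)
        - sum((yi - xi) ** 2 for xi, yi in zip(x, y)) / (2 * sigma ** 2)
        for x, p in mu.items()
    }
    shift = max(log_w.values())  # subtract the max for numerical stability
    w = {x: math.exp(lw - shift) for x, lw in log_w.items()}
    z = sum(w.values())
    return [sum(w[x] * x[k] for x in w) / z for k in range(len(y))]


if __name__ == "__main__":
    n, q = 4, 2
    mu = make_toy_distribution(n, q)
    rng = random.Random(1)
    order = list(range(n))
    rng.shuffle(order)  # any-order: coordinates may be revealed in any order
    sample, rounds = sequential_sample(mu, n, q, order, rng)
    print(sample, rounds)
    print(posterior_mean(mu, [1.0, 0.0, 1.0, 0.0], sigma=0.5))
```

The sequential sampler uses exactly n adaptive rounds because each oracle call depends on the coordinates pinned before it. That round count is the quantity the abstract says can be reduced to roughly n^{1/2} in expectation when oracle calls are issued in parallel.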
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference
