AdaDiff: Accelerating Diffusion Models through Step-Wise Adaptive Computation
Shengkun Tang, Yaqing Wang, Caiwen Ding, Yi Liang, Yao Li, Dongkuan Xu

TL;DR
AdaDiff introduces a dynamic adaptive computation framework for diffusion models, significantly speeding up image generation by evaluating uncertainty at each step to allocate resources efficiently and decide when to terminate inference.
Contribution
The paper presents a novel adaptive framework with a timestep-aware uncertainty estimation module and an uncertainty-aware loss to enhance diffusion model efficiency.
Findings
Achieves faster image generation with minimal quality loss.
Effectively reduces unnecessary computation during sampling.
Maintains high fidelity in generated images despite adaptive inference.
Abstract
Diffusion models achieve great success in generating diverse and high-fidelity images, yet their widespread application, especially in real-time scenarios, is hampered by their inherently slow generation speed. The slow generation stems from the necessity of multi-step network inference. While some certain predictions benefit from the full computation of the model in each sampling iteration, not every iteration requires the same amount of computation, potentially leading to inefficient computation. Unlike typical adaptive computation challenges that deal with single-step generation problems, diffusion processes with a multi-step generation need to dynamically adjust their computational resource allocation based on the ongoing assessment of each step's importance to the final image output, presenting a unique set of challenges. In this work, we propose AdaDiff, an adaptive framework that…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Neuroimaging Techniques and Applications · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning
MethodsDiffusion · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Early exiting using confidence measures
