DeepCache: Accelerating Diffusion Models for Free
Xinyin Ma, Gongfan Fang, Xinchao Wang

TL;DR
DeepCache is a training-free method that accelerates diffusion models by exploiting temporal redundancy in denoising steps, achieving significant speedups with minimal quality loss.
Contribution
DeepCache introduces a novel architecture-based, training-free approach that reuses features across denoising stages to speed up diffusion models without retraining.
Findings
Achieves 2.3× speedup on Stable Diffusion v1.5 with minimal quality decline.
Attains 4.1× acceleration on LDM-4-G with slight FID increase.
Outperforms existing pruning and distillation methods that require retraining.
Abstract
Diffusion models have recently gained unprecedented attention in the field of image synthesis due to their remarkable generative capabilities. Notwithstanding their prowess, these models often incur substantial computational costs, primarily attributed to the sequential denoising process and cumbersome model size. Traditional methods for compressing diffusion models typically involve extensive retraining, presenting cost and feasibility challenges. In this paper, we introduce DeepCache, a novel training-free paradigm that accelerates diffusion models from the perspective of model architecture. DeepCache capitalizes on the inherent temporal redundancy observed in the sequential denoising steps of diffusion models, which caches and retrieves features across adjacent denoising stages, thereby curtailing redundant computations. Utilizing the property of the U-Net, we reuse the high-level…
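The caching idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the split of the network into a cheap `shallow` part and a costly `deep` part, the scalar arithmetic, the `cache_interval` parameter, and all function names are assumptions made for illustration only. The sketch assumes the deep (high-level) features change slowly between adjacent denoising steps, so they are recomputed only every `cache_interval` steps and reused in between, while the shallow part runs at every step.

```python
from typing import Dict, Optional, Tuple


def shallow_features(x: float, t: int) -> float:
    """Stand-in for the cheap, low-level part of a U-Net (hypothetical)."""
    return 0.9 * x + 0.01 * t


def deep_features(h: float, t: int) -> float:
    """Stand-in for the costly, high-level part of a U-Net (hypothetical)."""
    return 0.5 * h


def merge(h: float, d: float) -> float:
    """Stand-in for combining shallow and deep features via a skip connection."""
    return h - 0.1 * d


def denoise(x: float, num_steps: int, cache_interval: int) -> Tuple[float, Dict[str, int]]:
    """Run a toy denoising loop, refreshing the deep features every `cache_interval` steps."""
    if cache_interval < 1:
        raise ValueError("cache_interval must be at least 1")

    calls = {"shallow": 0, "deep": 0}
    cached_deep: Optional[float] = None

    for i, t in enumerate(range(num_steps, 0, -1)):
        # The shallow part is always recomputed from the current sample.
        h = shallow_features(x, t)
        calls["shallow"] += 1

        # The deep part is recomputed only on refresh steps; otherwise the
        # value cached at the most recent refresh step is reused.
        if cached_deep is None or i % cache_interval == 0:
            cached_deep = deep_features(h, t)
            calls["deep"] += 1

        x = merge(h, cached_deep)

    return x, calls


if __name__ == "__main__":
    _, full = denoise(1.0, num_steps=50, cache_interval=1)
    _, cached = denoise(1.0, num_steps=50, cache_interval=5)
    print("deep calls without caching:", full["deep"])
    print("deep calls with caching:", cached["deep"])
```

In this toy setup, a larger `cache_interval` skips more deep computations at the cost of using staler features; the actual trade-off between speed and quality depends on the model and is reported in the paper, not by this sketch.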
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Cell Image Analysis Techniques
Methods: Pruning · Max Pooling · Contrastive Language-Image Pre-training · Diffusion · Concatenated Skip Connection · Convolution · U-Net
