Post-training Quantization on Diffusion Models
Yuzhang Shang, Zhihang Yuan, Bin Xie, Bingzhe Wu, Yan Yan

TL;DR
This paper introduces a post-training quantization (PTQ) method tailored to diffusion models. It produces 8-bit models without retraining, accelerating generation while preserving or even improving performance.
Contribution
The paper develops a PTQ approach specific to diffusion models that accounts for the distributions arising at multiple time steps, enabling training-free model compression and acceleration.
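To illustrate why the time step matters for calibration, here is a minimal sketch, not the authors' procedure. It assumes a hypothetical stand-in function, `fake_activation`, whose output spread grows with the time step, and compares a quantization range calibrated at one time step against one calibrated over several. The function name, the time steps chosen, and the growth rule are all illustrative assumptions.

```python
import random


def fake_activation(t, n=256, seed=0):
    """Hypothetical stand-in for a layer's activations at time step t.

    The spread grows with t purely for illustration; a real noise
    estimation network would be run on actual noisy inputs instead.
    """
    rng = random.Random(seed + t)
    scale = 1.0 + t / 100.0
    return [rng.gauss(0.0, scale) for _ in range(n)]


def calibrate_range(time_steps):
    """Return the (min, max) activation seen over all given time steps."""
    lo, hi = float("inf"), float("-inf")
    for t in time_steps:
        acts = fake_activation(t)
        lo = min(lo, min(acts))
        hi = max(hi, max(acts))
    return lo, hi


# Range calibrated at a single time step vs. across several time steps.
single = calibrate_range([0])
multi = calibrate_range([0, 250, 500, 750, 999])
print("single-step range:", single)
print("multi-step range: ", multi)
```

A range calibrated at only one time step can clip values that occur at other steps, which is one reason a calibration set for a diffusion model plausibly needs to cover several time steps.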
Findings
Quantizes full-precision diffusion models (DMs) to 8 bits while maintaining or improving performance.
Enables plug-and-play integration with other fast-sampling methods like DDIM.
Achieves acceleration without retraining, suitable for deployment on edge devices.
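As a rough illustration of what 8-bit quantization means, the sketch below applies a generic uniform quantizer to a small list of values and maps them back to floats. It is a textbook-style min/max scheme, not the paper's method, and the helper names `quantize_uint8` and `dequantize` and the sample weights are hypothetical.

```python
def quantize_uint8(values):
    """Map floats to integers in [0, 255] using a min/max uniform grid."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # all values identical; any step size works
    # Round to the nearest grid index and clamp to the 8-bit range.
    q = [max(0, min(255, round((v - lo) / scale))) for v in values]
    return q, scale, lo


def dequantize(q, scale, lo):
    """Map 8-bit integers back to approximate float values."""
    return [lo + x * scale for x in q]


weights = [-1.0, -0.25, 0.0, 0.3, 0.9]  # illustrative values only
q, scale, lo = quantize_uint8(weights)
restored = dequantize(q, scale, lo)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print("quantized:", q)
print("max round-trip error:", max_err)
```

Each value is stored in 8 bits instead of 32, a 4x reduction in size, and the round-trip error of this scheme is bounded by half the step size `scale`.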
Abstract
Denoising diffusion (score-based) generative models have recently achieved significant accomplishments in generating realistic and diverse data. These approaches define a forward diffusion process for transforming data into noise and a backward denoising process for sampling data from noise. Unfortunately, the generation process of current denoising diffusion models is notoriously slow due to the lengthy iterative noise estimations, which rely on cumbersome neural networks. This prevents diffusion models from being widely deployed, especially on edge devices. Previous works accelerate the generation process of diffusion models (DMs) by finding shorter yet effective sampling trajectories. However, they overlook the cost of noise estimation with a heavy network in every iteration. In this work, we accelerate generation from the perspective of compressing the noise estimation network. Due…
Taxonomy
Topics: Advanced Neuroimaging Techniques and Applications · Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks
Methods: Diffusion
