Post-training Quantization on Diffusion Models

Yuzhang Shang; Zhihang Yuan; Bin Xie; Bingzhe Wu; Yan Yan

arXiv:2211.15736·cs.CV·March 17, 2023

TL;DR

This paper introduces a post-training quantization method tailored for diffusion models, enabling efficient 8-bit models without retraining, thus accelerating generation while preserving or enhancing performance.

Contribution

The paper develops a post-training quantization (PTQ) approach designed specifically for diffusion models. The approach accounts for distributions that span multiple time steps, which allows the model to be compressed and accelerated without any training.
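A minimal sketch of why the time-step dimension matters for calibration, in plain Python. This is not the paper's procedure: `fake_activations` is a hypothetical stand-in for a layer's activations (its spread is made to grow with the time step), and min-max range calibration with an unsigned 8-bit grid is an assumed, generic quantization scheme. The sketch compares a range calibrated on one time step with a range calibrated on samples pooled from several time steps.

```python
import random

def minmax_range(samples):
    """Return the (min, max) range of a flat list of calibration values."""
    return min(samples), max(samples)

def quantize_uint8(x, lo, hi):
    """Uniform affine quantization of x to an integer in [0, 255].
    Values outside [lo, hi] are clipped to the calibrated range."""
    if hi <= lo:
        raise ValueError("calibration range must have positive width")
    scale = (hi - lo) / 255.0
    clipped = min(max(x, lo), hi)
    return round((clipped - lo) / scale), scale

def dequantize_uint8(q, lo, scale):
    """Map an 8-bit code back to a floating-point value."""
    return lo + q * scale

def fake_activations(t, n, rng):
    """Hypothetical activations at time step t (assumption, not real model
    output): a zero-mean Gaussian whose spread grows with t."""
    return [rng.gauss(0.0, 0.1 + 0.01 * t) for _ in range(n)]

rng = random.Random(0)

# Calibration set A: samples from a single time step.
single_step = fake_activations(0, 256, rng)
# Calibration set B: samples pooled across several time steps.
multi_step = [v for t in (0, 25, 50, 75, 99)
              for v in fake_activations(t, 256, rng)]

lo_s, hi_s = minmax_range(single_step)
lo_m, hi_m = minmax_range(multi_step)

# Quantize a large activation (the largest one seen across time steps)
# with each calibrated range and measure the reconstruction error.
x = hi_m
q_s, scale_s = quantize_uint8(x, lo_s, hi_s)
q_m, scale_m = quantize_uint8(x, lo_m, hi_m)
err_single = abs(x - dequantize_uint8(q_s, lo_s, scale_s))
err_multi = abs(x - dequantize_uint8(q_m, lo_m, scale_m))

print("single-step range:", (lo_s, hi_s), "error:", err_single)
print("multi-step range: ", (lo_m, hi_m), "error:", err_multi)
```

Under these assumptions, the single-step range clips activations that only occur at other time steps, while the pooled range covers them at the cost of a coarser step size. How the calibration samples should actually be drawn across time steps is the question the paper addresses, and this sketch does not reproduce its answer.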

Findings

01 Quantizes full-precision diffusion models (DMs) into 8-bit models with maintained or improved performance.

02 Enables plug-and-play integration with other fast-sampling methods such as DDIM.

03 Achieves acceleration without retraining, which makes it suitable for deployment on edge devices.
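As a rough illustration of what training-free 8-bit quantization of a weight tensor involves, here is a sketch in plain Python. The symmetric, per-tensor scheme and the function names `quantize_weights_int8` and `dequantize_weights` are assumptions chosen for brevity; the listing above does not say which quantizer the paper uses.

```python
def quantize_weights_int8(weights):
    """Symmetric per-tensor quantization (assumed scheme): map floats to
    integers in [-127, 127] using a single scale, with no retraining."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_weights(q, scale):
    """Recover approximate floating-point weights from 8-bit codes."""
    return [v * scale for v in q]

# Toy weight values, made up for the example.
weights = [0.5, -1.27, 0.0, 0.33, 1.0]

q, scale = quantize_weights_int8(weights)
restored = dequantize_weights(q, scale)

# For values inside the range, rounding error is at most half a step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 codes:", q)
print("scale:", scale)
print("max reconstruction error:", max_err)
```

Each weight is stored in one byte instead of the four bytes of a 32-bit float, so the stored weights shrink by a factor of four. The speedup on a given device depends on its support for integer arithmetic, which this sketch does not model.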

Abstract

Denoising diffusion (score-based) generative models have recently achieved significant success in generating realistic and diverse data. These approaches define a forward diffusion process that transforms data into noise and a backward denoising process that samples data from noise. Unfortunately, the generation process of current denoising diffusion models is notoriously slow because of the lengthy iterative noise estimations, which rely on cumbersome neural networks. This prevents diffusion models from being widely deployed, especially on edge devices. Previous works accelerate the generation process of diffusion models (DMs) by finding shorter yet effective sampling trajectories. However, they overlook the cost of running a heavy noise-estimation network in every iteration. In this work, we accelerate generation from the perspective of compressing the noise estimation network. Due…

Code & Models

Repositories

42shawn/ptq4dm (PyTorch, official)

Taxonomy

Topics: Advanced Neuroimaging Techniques and Applications · Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks

Methods: fail · Diffusion