Q-Diffusion: Quantizing Diffusion Models

Xiuyu Li; Yijiang Liu; Long Lian; Huanrui Yang; Zhen Dong; Daniel Kang; Shanghang Zhang; Kurt Keutzer

arXiv:2302.04304·cs.CV·June 9, 2023



TL;DR

Q-Diffusion is a post-training quantization (PTQ) method tailored to diffusion models. It compresses the noise estimation network to 4-bit with minimal performance loss, which speeds up inference for image synthesis.

Contribution

The paper presents a PTQ technique designed specifically for diffusion models, addressing their multi-timestep pipeline and the activation distributions that arise in their model architecture.

Findings

01 Quantizes diffusion models to 4-bit with minimal FID increase

02 Maintains high generation quality in text-guided image synthesis

03 Achieves training-free quantization with significant efficiency gains

Abstract

Diffusion models have achieved great success in image synthesis through iterative noise estimation using deep neural networks. However, the slow inference, high memory consumption, and computation intensity of the noise estimation model hinder the efficient adoption of diffusion models. Although post-training quantization (PTQ) is considered a go-to compression method for other tasks, it does not work out-of-the-box on diffusion models. We propose a novel PTQ method specifically tailored towards the unique multi-timestep pipeline and model architecture of the diffusion models, which compresses the noise estimation network to accelerate the generation process. We identify the key difficulty of diffusion model quantization as the changing output distributions of noise estimation networks over multiple time steps and the bimodal activation distribution of the shortcut layers within the…
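The difficulty the abstract names, activation distributions that change across time steps, can be illustrated with a toy example. The sketch below is not the paper's method: it applies plain uniform 4-bit quantization to synthetic Gaussian values, and the names `fake_quantize`, `acts_early`, and `acts_late` are hypothetical stand-ins for a quantizer and for activations at two time steps. It compares a clipping range calibrated on a single time step with one calibrated on values pooled from both.

```python
import random

def fake_quantize(values, lo, hi, n_bits=4):
    """Uniform quantize-then-dequantize onto 2**n_bits levels spanning [lo, hi]."""
    levels = 2 ** n_bits - 1
    scale = (hi - lo) / levels
    out = []
    for v in values:
        q = round((v - lo) / scale)      # nearest integer level
        q = min(max(q, 0), levels)       # clip values outside [lo, hi]
        out.append(q * scale + lo)       # map back to the real-valued grid
    return out

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

rng = random.Random(0)
# Assumption: synthetic Gaussians standing in for activations at two time steps
# whose spreads differ. Real activations would come from the network itself.
acts_early = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
acts_late = [rng.gauss(0.0, 4.0) for _ in range(10_000)]

# Range calibrated on one time step only, then reused at the other.
lo, hi = min(acts_early), max(acts_early)
err_single = mse(fake_quantize(acts_late, lo, hi), acts_late)

# Range calibrated on values pooled from both time steps.
pooled = acts_early + acts_late
lo_p, hi_p = min(pooled), max(pooled)
err_pooled = mse(fake_quantize(acts_late, lo_p, hi_p), acts_late)

print(f"single-step calibration MSE: {err_single:.3f}")
print(f"pooled calibration MSE:      {err_pooled:.3f}")
```

In this toy setting the single-step range clips a large share of the wider distribution, so its error is dominated by clipping rather than rounding. Pooling is used here only as a naive baseline to make that effect visible.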


Code & Models

Repositories

Xiuyu-Li/q-diffusion
PyTorch · Official


Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · AI in cancer detection

Methods: Diffusion · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings