Effective Quantization for Diffusion Models on CPUs
Hanwen Chang, Haihao Shen, Yiyang Cai, Xinyu Ye, Zhenzhong Xu, Wenhua Cheng, Kaokao Lv, Weiwei Zhang, Yintong Lu, Heng Guo

TL;DR
This paper introduces a novel quantization approach for diffusion models that combines quantization-aware training and distillation, enabling efficient CPU inference without sacrificing image quality.
Contribution
It presents a new quantization method specifically designed for diffusion models, addressing their sensitivity and maintaining high image quality during efficient CPU inference.
Findings
Quantized diffusion models retain high image quality.
The approach improves inference efficiency on CPUs.
Code is publicly available for reproducibility.
Abstract
Diffusion models have gained popularity for generating images from textual descriptions. Nonetheless, their substantial demand for computational resources remains a notable challenge and makes generation time-consuming. Quantization, a technique for compressing deep learning models to improve efficiency, is difficult to apply to diffusion models. These models are notably more sensitive to quantization than other model types, which can degrade image quality. In this paper, we introduce a novel approach to quantizing diffusion models that leverages both quantization-aware training and distillation. Our results show that the quantized models maintain high image quality while delivering efficient inference on CPUs. The code is publicly available at: https://github.com/intel/intel-extension-for-transformers.
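The abstract names two ingredients, quantization-aware training and distillation, without giving details. As a rough illustration of the general idea only, the sketch below simulates INT8 rounding on a tiny linear layer ("fake quantization") and measures how far the quantized student's output drifts from the full-precision teacher's output with a mean-squared-error distillation loss. All names here (`fake_quantize`, `distillation_loss`, `linear`) and the symmetric per-row scaling scheme are assumptions made for illustration; they are not taken from the paper or from its repository.

```python
def fake_quantize(values, num_bits=8):
    """Symmetric fake quantization: round to signed integers, then map back to floats."""
    qmax = 2 ** (num_bits - 1) - 1              # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = []
    for v in values:
        q = round(v / scale)                    # nearest integer level
        q = max(-qmax, min(qmax, q))            # clamp to the representable range
        quantized.append(q * scale)             # dequantize back to float
    return quantized


def distillation_loss(student_out, teacher_out):
    """Mean squared error between student and teacher outputs."""
    diffs = [(s - t) ** 2 for s, t in zip(student_out, teacher_out)]
    return sum(diffs) / len(diffs)


def linear(weights, x):
    """Plain matrix-vector product, standing in for one model layer."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


# Hypothetical toy data: the teacher is the full-precision layer,
# the student is the same layer with fake-quantized weights.
teacher_w = [[0.5, -1.2], [0.03, 0.8]]
x = [1.0, 2.0]

teacher_out = linear(teacher_w, x)
student_w = [fake_quantize(row, num_bits=8) for row in teacher_w]
student_out = linear(student_w, x)

loss = distillation_loss(student_out, teacher_out)
print("distillation loss with 8-bit weights:", loss)
```

In an actual quantization-aware training loop, a loss of this kind would be minimized by gradient descent, with gradients typically passed through the rounding step by a straight-through estimator; that part is omitted here to keep the sketch dependency-free.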
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Music and Audio Processing · Image Retrieval and Classification Techniques
Methods: Diffusion
