TCAQ-DM: Timestep-Channel Adaptive Quantization for Diffusion Models
Haocheng Huang, Jiaxin Chen, Jinyang Guo, Ruiyi Zhan, Yunhong Wang

TL;DR
This paper introduces TCAQ-DM, a novel adaptive quantization method for diffusion models that reduces inference memory and time costs while maintaining high image quality, by balancing activation ranges and selecting optimal quantizers.
Contribution
The paper proposes TCR, DAQ, and PAR modules that improve post-training quantization for diffusion models, addressing activation distribution variations and input mismatch issues.
Findings
Outperforms state-of-the-art quantization methods on benchmarks.
Achieves comparable FID to full precision models on CIFAR-10.
Enables effective image generation at lower bit-widths.
Abstract
Diffusion models have achieved remarkable success in image and video generation tasks. Nevertheless, they often incur large memory and time overhead during inference, due to their complex network architecture and the considerable number of timesteps required for iterative diffusion. Recently, post-training quantization (PTQ) has proven to be a promising way to reduce the inference cost by quantizing floating-point operations to low-bit ones. However, most existing PTQ methods fail to address the large variations in the distribution of activations across distinct channels and timesteps, as well as the inconsistency of inputs between quantization and inference on diffusion models, thus leaving much room for improvement. To address the above issues, we propose a novel method dubbed Timestep-Channel Adaptive Quantization for Diffusion Models (TCAQ-DM). Specifically, we develop a…
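To make the channel-variation problem in the abstract concrete, the following is a minimal sketch of generic uniform quantization, not the TCAQ-DM method itself. It assumes two hypothetical activation channels with very different ranges and a 4-bit quantizer; the helper names `quantize` and `mse` and all numeric values are illustrative assumptions, not taken from the paper. It compares the reconstruction error when one range is shared across both channels against the error when each channel gets its own range.

```python
import random

def quantize(values, lo, hi, bits=4):
    """Uniform affine fake-quantization onto 2**bits levels spanning [lo, hi]."""
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    out = []
    for v in values:
        clamped = min(max(v, lo), hi)        # clip to the quantizer range
        code = round((clamped - lo) / scale)  # integer code in [0, levels]
        out.append(code * scale + lo)         # map back to a float value
    return out

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

random.seed(0)
# Hypothetical activations: one narrow channel, one wide channel.
channels = [
    [random.uniform(-0.1, 0.1) for _ in range(1000)],
    [random.uniform(-10.0, 10.0) for _ in range(1000)],
]

# Per-tensor quantization: a single range shared by both channels.
flat = [v for ch in channels for v in ch]
lo, hi = min(flat), max(flat)
per_tensor_err = [mse(ch, quantize(ch, lo, hi)) for ch in channels]

# Per-channel quantization: each channel uses its own range.
per_channel_err = [mse(ch, quantize(ch, min(ch), max(ch))) for ch in channels]

for i in range(len(channels)):
    print(f"channel {i}: per-tensor MSE={per_tensor_err[i]:.6f}, "
          f"per-channel MSE={per_channel_err[i]:.6f}")
```

In this toy setup the shared range is dictated by the wide channel, so the narrow channel is left with only a couple of usable quantization levels and its error is far larger than with a per-channel range. The same effect appears when activation ranges drift across timesteps, which is the kind of mismatch the abstract says existing PTQ methods handle poorly.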
Taxonomy
Topics: Advanced Data Compression Techniques
Methods: Diffusion
