Efficiency Meets Fidelity: A Novel Quantization Framework for Stable Diffusion
Shuaiting Li, Juncan Deng, Zeyu Wang, Kedong Xu, Rongtao Deng, Hong Gu, Haibin Shen, Kejie Huang

TL;DR
This paper introduces a novel quantization framework for Stable Diffusion models that balances efficiency and high-fidelity image generation, ensuring consistency with floating-point models for real-time applications.
Contribution
The authors propose a Serial-to-Parallel quantization pipeline with techniques like multi-timestep activation quantization and inter-layer distillation to improve efficiency and output fidelity.
Findings
Outperforms state-of-the-art quantization methods in quality and efficiency.
Achieves high-fidelity image generation with shorter training times.
Maintains consistency between quantized and floating-point models across multiple SD variants.
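The consistency finding can be made concrete with a pixel-level comparison between an image from the floating-point model and the image the quantized model produces from the same prompt and seed. The sketch below uses PSNR for this; the choice of PSNR and the helper name `psnr` are assumptions for illustration, since the summary does not state which metrics the authors report.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], candidate: Sequence[float], max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio (dB) between two flattened images.

    Higher means the candidate (e.g. quantized-model output) is closer
    to the reference (e.g. floating-point output). Illustrative metric
    only; not confirmed as the paper's evaluation protocol.
    """
    if len(reference) != len(candidate):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


# Toy pixel values in [0, 1] (made up for the example).
fp_image = [0.0, 1.0]
quant_image = [0.1, 0.9]
score = psnr(fp_image, quant_image)
```

A quality-only metric such as FID can look good even when the quantized model draws a different image than the floating-point one; a paired comparison like this one measures agreement sample by sample.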
Abstract
Text-to-image generation via Stable Diffusion models (SDM) has demonstrated remarkable capabilities. However, their computational intensity, particularly in the iterative denoising process, hinders real-time deployment in latency-sensitive applications. While recent studies have explored post-training quantization (PTQ) and quantization-aware training (QAT) methods to compress diffusion models, existing methods often overlook the consistency between results generated by quantized models and those from floating-point models. This consistency is paramount for professional applications where both efficiency and output reliability are essential. To ensure that quantized SDM generates high-quality and consistent images, we propose an efficient quantization framework for SDM. Our framework introduces a Serial-to-Parallel pipeline that simultaneously maintains training-inference consistency…
Taxonomy
Topics: Neural Networks and Applications
Methods: Diffusion
