Efficient Quantization Strategies for Latent Diffusion Models
Yuewei Yang, Xiaoliang Dai, Jialiang Wang, Peizhao Zhang, Hongbo Zhang

TL;DR
This paper introduces a novel quantization strategy for Latent Diffusion Models that enhances their deployability on edge devices by balancing model size reduction with performance preservation, using SQNR as a key metric.
Contribution
It proposes a combined global and local quantization approach tailored for LDMs, addressing their temporal and structural complexities for efficient PTQ.
Findings
Significant reduction in model size with minimal performance loss.
Effective quantization of sensitive and time-sensitive modules.
Improved deployment efficiency on edge devices.
Abstract
Latent Diffusion Models (LDMs) capture the dynamic evolution of latent variables over time, blending patterns and multimodality in a generative system. Despite the proficiency of LDMs in various applications, such as text-to-image generation, facilitated by robust text encoders and a variational autoencoder, the critical need to deploy large generative models on edge devices compels a search for more compact yet effective alternatives. Post Training Quantization (PTQ), a method to compress the operational size of deep learning models, encounters challenges when applied to LDMs due to their temporal and structural complexities. This study proposes a quantization strategy that efficiently quantizes LDMs, leveraging Signal-to-Quantization-Noise Ratio (SQNR) as a pivotal metric for evaluation. By treating the quantization discrepancy as relative noise and identifying sensitive part(s) of a model, we…
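To make the SQNR idea in the abstract concrete, here is a minimal sketch that treats the quantization discrepancy as noise relative to the full-precision signal. It assumes the common definition SQNR = 10 * log10(signal power / quantization-noise power); the paper's exact formulation is not shown in this excerpt. The helper names `quantize_uniform` and `sqnr_db`, the symmetric uniform quantizer, and the synthetic `reference` values are all illustrative assumptions, not the authors' implementation.

```python
import math


def quantize_uniform(values, num_bits=8):
    """Symmetric uniform fake-quantization (an assumed scheme, for illustration).

    Values are rounded onto a signed num_bits integer grid and mapped back
    to floats, so the result can be compared directly with the input.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)
    qmax = 2 ** (num_bits - 1) - 1
    scale = max_abs / qmax
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def sqnr_db(reference, quantized):
    """SQNR in decibels: 10 * log10(signal power / quantization-noise power)."""
    signal = sum(r * r for r in reference)
    noise = sum((r - q) ** 2 for r, q in zip(reference, quantized))
    if noise == 0.0:
        return math.inf  # no discrepancy at all
    if signal == 0.0:
        return -math.inf  # pure noise on a zero signal
    return 10.0 * math.log10(signal / noise)


# Hypothetical stand-in for a module's full-precision output.
reference = [math.sin(0.1 * i) for i in range(1000)]

sqnr_8bit = sqnr_db(reference, quantize_uniform(reference, num_bits=8))
sqnr_4bit = sqnr_db(reference, quantize_uniform(reference, num_bits=4))
print(f"8-bit SQNR: {sqnr_8bit:.1f} dB, 4-bit SQNR: {sqnr_4bit:.1f} dB")
```

A lower SQNR means the quantized output deviates more from the full-precision one. Under this reading, computing the score per module, or per denoising step, is one plausible way to flag the sensitive parts the abstract refers to.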
Taxonomy
Topics: Topic Modeling · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning
Methods: Diffusion
