MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
Tianchen Zhao, Xuefei Ning, Tongcheng Fang, Enshu Liu, Guyue Huang, Zinan Lin, Shengen Yan, Guohao Dai, Yu Wang

TL;DR
MixDQ introduces a memory-efficient mixed-precision quantization framework for few-step text-to-image diffusion models, significantly reducing memory and latency while maintaining image quality and text alignment.
Contribution
The paper proposes a novel mixed-precision quantization method with specialized sensitivity analysis and bit-width allocation for diffusion models, enabling high compression without performance loss.
Findings
Achieves 3-4x reduction in model size and memory cost compared to FP16.
Maintains image quality and text alignment at W8A8 quantization.
Provides 1.45x speedup in inference latency.
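To make the size figures above concrete, the following is a minimal sketch of how a mixed-precision weight allocation translates into a reduction relative to FP16. The layer sizes and bit-widths are hypothetical values chosen for illustration, not numbers from the paper, and the helper `compression_ratio` is an assumed name. The calculation ignores the small overhead of quantization parameters (scales, zero points) and does not model activation memory.

```python
def compression_ratio(layer_params, layer_bits, baseline_bits=16):
    """Return (average weight bit-width, size reduction vs. the baseline)."""
    total_params = sum(layer_params)
    # Total storage in bits when each layer uses its own bit-width.
    total_bits = sum(n * b for n, b in zip(layer_params, layer_bits))
    avg_bits = total_bits / total_params
    return avg_bits, baseline_bits / avg_bits


# Hypothetical model: one sensitive layer kept at 8 bits, the rest at 4 bits.
params = [1_000_000, 4_000_000, 3_000_000]
bits = [8, 4, 4]

avg_bits, ratio = compression_ratio(params, bits)
print(f"average bits: {avg_bits:.2f}, reduction vs FP16: {ratio:.2f}x")
```

Under these assumed numbers the average weight width is 4.5 bits, which gives a reduction of about 3.6x over FP16. Keeping a few sensitive layers at higher precision moves the ratio only slightly when those layers hold a small share of the parameters.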
Abstract
Diffusion models have achieved significant visual generation quality. However, their significant computational and memory costs pose challenges for their application on resource-constrained mobile devices or even desktop GPUs. Recent few-step diffusion models reduce the inference time by reducing the number of denoising steps. However, their memory consumption is still excessive. Post-Training Quantization (PTQ) replaces high bit-width FP representations with low-bit integer values (INT4/8), which is an effective and efficient technique to reduce the memory cost. However, when applied to few-step diffusion models, existing quantization methods face challenges in preserving both image quality and text alignment. To address this issue, we propose a mixed-precision quantization framework, MixDQ. First, we design a specialized BOS-aware quantization method for highly sensitive text…
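As a rough illustration of the PTQ idea mentioned in the abstract (replacing floating-point values with low-bit integers), here is a minimal sketch of generic per-tensor symmetric uniform quantization in plain Python. It is not the MixDQ method: it includes no BOS-aware handling, sensitivity analysis, or bit-width allocation, and the function names and sample weights are assumptions made for this example.

```python
def quantize_symmetric(values, num_bits=8):
    """Per-tensor symmetric uniform quantization to signed integers."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8, 7 for INT4
    max_abs = max(abs(v) for v in values)
    # The scale maps the largest magnitude onto the largest integer level.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest level and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map integer levels back to approximate floating-point values."""
    return [x * scale for x in q]


# Hypothetical weights, quantized at 8 and 4 bits.
weights = [0.5, -1.0, 0.25, 0.0]
q8, s8 = quantize_symmetric(weights, num_bits=8)
q4, s4 = quantize_symmetric(weights, num_bits=4)

# Worst-case reconstruction error at each bit-width.
err8 = max(abs(w - r) for w, r in zip(weights, dequantize(q8, s8)))
err4 = max(abs(w - r) for w, r in zip(weights, dequantize(q4, s4)))
print(f"INT8 max error: {err8:.4f}, INT4 max error: {err4:.4f}")
```

The rounding error is bounded by half the scale, so halving the bit-width from 8 to 4 makes the step size, and therefore the worst-case error, much larger. That gap is what motivates assigning higher precision only to the layers that are most sensitive.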
Taxonomy
Topics: AI in cancer detection · Image Retrieval and Classification Techniques · Radiomics and Machine Learning in Medical Imaging
Methods: Diffusion
