Pioneering 4-Bit FP Quantization for Diffusion Models: Mixup-Sign Quantization and Timestep-Aware Fine-Tuning
Maosen Zhao, Pengtao Chen, Chong Yu, Yan Wen, Xudong Tan, Tao Chen

TL;DR
This paper introduces a novel 4-bit floating-point quantization method for diffusion models, using mixup-sign quantization and timestep-aware fine-tuning to improve efficiency and performance over existing integer-based approaches.
Contribution
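The paper proposes the mixup-sign floating-point quantization (MSFP) framework, which combines unsigned FP quantization with timestep-aware LoRA (TALoRA) and denoising-factor loss alignment (DFA). Together these target three problems: asymmetric activation distributions that signed FP formats handle poorly, the temporal complexity of the denoising process during fine-tuning, and the misalignment between fine-tuning loss and quantization error.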
Findings
First to achieve superior performance with 4-bit FP quantization of diffusion models.
Outperforms existing PTQ fine-tuning methods that rely on 4-bit INT quantization.
Demonstrates improved memory efficiency and inference speed.
Abstract
Model quantization reduces the bit-width of weights and activations, improving memory efficiency and inference speed in diffusion models. However, achieving 4-bit quantization remains challenging. Existing methods, primarily based on integer quantization and post-training quantization fine-tuning, struggle with inconsistent performance. Inspired by the success of floating-point (FP) quantization in large language models, we explore low-bit FP quantization for diffusion models and identify key challenges: the failure of signed FP quantization to handle asymmetric activation distributions, the insufficient consideration of temporal complexity in the denoising process during fine-tuning, and the misalignment between fine-tuning loss and quantization error. To address these challenges, we propose the mixup-sign floating-point quantization (MSFP) framework, first introducing unsigned FP…
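The first challenge named in the abstract, signed FP formats wasting range on asymmetric activations, can be illustrated with a small sketch. Everything here is an assumption made for illustration rather than the paper's configuration: the helper names `fp_grid` and `quant_mse`, the toy E2M1 (signed) and E2M2 (unsigned) layouts, the max-based scale, and the use of the minimum activation as an offset. The activations are SiLU outputs, which are bounded below near -0.28 but unbounded above.

```python
import math

def fp_grid(exp_bits, man_bits, signed):
    """Enumerate the values of a toy minifloat (subnormals, no inf/NaN)."""
    bias = 2 ** (exp_bits - 1) - 1
    mags = set()
    for e in range(2 ** exp_bits):
        for m in range(2 ** man_bits):
            frac = m / 2 ** man_bits
            if e == 0:   # subnormal: no implicit leading 1
                mags.add(frac * 2.0 ** (1 - bias))
            else:        # normal: implicit leading 1
                mags.add((1 + frac) * 2.0 ** (e - bias))
    vals = set(mags)
    if signed:
        vals |= {-v for v in mags}
    return sorted(vals)

def quant_mse(xs, grid, offset=0.0):
    """Mean squared error of round-to-nearest quantization onto `grid`.

    `offset` shifts the data first (a hypothetical zero point); the scale
    maps the largest shifted magnitude onto the largest grid value.
    """
    shifted = [x - offset for x in xs]
    scale = max(abs(v) for v in shifted) / max(grid)
    err = 0.0
    for v in shifted:
        q = min(grid, key=lambda g: abs(v / scale - g)) * scale
        err += (v - q) ** 2
    return err / len(xs)

def silu(x):
    return x / (1.0 + math.exp(-x))

# Asymmetric activations: SiLU of evenly spaced inputs in [-4, 4].
acts = [silu(-4 + 8 * i / 999) for i in range(1000)]

signed_grid = fp_grid(2, 1, signed=True)     # 1 sign bit + E2M1
unsigned_grid = fp_grid(2, 2, signed=False)  # all 4 bits for magnitude, E2M2

signed_mse = quant_mse(acts, signed_grid)
unsigned_mse = quant_mse(acts, unsigned_grid, offset=min(acts))

print(f"signed   4-bit FP MSE: {signed_mse:.5f}")
print(f"unsigned 4-bit FP MSE: {unsigned_mse:.5f}")
```

In this toy setup almost the entire negative half of the signed grid goes unused, while the unsigned grid spends its extra bit on finer spacing over the range the data actually occupies.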
Taxonomy
Topics: Advanced Neuroimaging Techniques and Applications · Stochastic Gradient Optimization Techniques · Ferroelectric and Negative Capacitance Devices
