IntLoRA: Integral Low-rank Adaptation of Quantized Diffusion Models
Hang Guo, Yawei Li, Tao Dai, Shu-Tao Xia, Luca Benini

TL;DR
IntLoRA fine-tunes quantized diffusion models with integer-type low-rank adaptation. Because the weights stay quantized throughout, training and inference are faster, with no reported loss in performance.
Contribution
The paper proposes IntLoRA, a novel approach that allows inference-efficient fine-tuning of quantized diffusion models with integer low-rank parameters, eliminating the need for post-training quantization.
Findings
Achieves significant speedup in training and inference.
Maintains model performance without degradation.
Enables seamless merging of adapted weights into pre-trained models.
Abstract
Fine-tuning pre-trained diffusion models under limited budgets has seen great success. In particular, recent advances that directly fine-tune the quantized weights using Low-rank Adaptation (LoRA) further reduce training costs. Despite this progress, we point out that existing adaptation recipes are not inference-efficient. Specifically, additional post-training quantization (PTQ) on tuned weights is needed during deployment, which results in a noticeable performance drop when the bit-width is low. Based on this observation, we introduce IntLoRA, which adapts quantized diffusion models with integer-type low-rank parameters, to include inference efficiency during tuning. Specifically, IntLoRA enables pre-trained weights to remain quantized during training, facilitating fine-tuning on consumer-level GPUs. During inference, IntLoRA weights can be seamlessly merged into pre-trained…
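The deployment problem the abstract describes can be made concrete with a small NumPy sketch. This is not the IntLoRA algorithm, whose details are cut off above. It only illustrates the conventional recipe under stated assumptions: a simple min-max uniform quantizer, random weights, and hypothetical helper names (quantize, dequantize, ptq_error_after_merge) chosen for this example. Merging a floating-point LoRA update into dequantized weights generally moves them off the integer grid, so the merged matrix must be quantized again, and that second rounding step costs more accuracy at lower bit-widths.

```python
import numpy as np


def quantize(w, bits):
    """Min-max uniform quantization (an assumed, simplified quantizer).

    Returns integer codes in [0, 2**bits - 1], the scale, and the zero point.
    """
    qmax = 2**bits - 1
    w_min, w_max = w.min(), w.max()
    scale = (w_max - w_min) / qmax
    zero = np.round(-w_min / scale)
    q = np.clip(np.round(w / scale) + zero, 0, qmax).astype(np.int64)
    return q, scale, zero


def dequantize(q, scale, zero):
    """Map integer codes back to floating-point values."""
    return (q - zero) * scale


def ptq_error_after_merge(bits, seed=0, d=64, r=4):
    """Mean squared error introduced by re-quantizing merged weights.

    All shapes and initializations here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Pre-trained weights, stored in quantized form.
    w = rng.normal(size=(d, d))
    q, scale, zero = quantize(w, bits)
    w_hat = dequantize(q, scale, zero)

    # Floating-point low-rank update (conventional LoRA: W + B @ A).
    a = rng.normal(scale=0.1, size=(r, d))
    b = rng.normal(scale=0.1, size=(d, r))
    merged = w_hat + b @ a  # no longer on the quantization grid

    # Deployment with quantized weights requires a second quantization (PTQ).
    q2, scale2, zero2 = quantize(merged, bits)
    merged_hat = dequantize(q2, scale2, zero2)
    return float(np.mean((merged - merged_hat) ** 2))


if __name__ == "__main__":
    for bits in (8, 4, 2):
        print(bits, ptq_error_after_merge(bits))
```

Running the script prints the re-quantization error for a few bit-widths. In this toy setting the error grows as the bit-width shrinks, which is the effect the abstract attributes to PTQ on tuned weights.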
Taxonomy
Topics: Advanced Neuroimaging Techniques and Applications · Model Reduction and Neural Networks · Seismic Imaging and Inversion Techniques
Methods: Diffusion
