Accurate LoRA-Finetuning Quantization of LLMs via Information Retention
Haotong Qin, Xudong Ma, Xingyu Zheng, Xiaoyang Li, Yang Zhang, Shouda Liu, Jie Luo, Xianglong Liu, Michele Magno

TL;DR
This paper introduces IR-QLoRA, a novel method for quantizing LLMs with LoRA that retains information effectively, significantly improving accuracy with minimal additional computational cost across various models and frameworks.
Contribution
IR-QLoRA combines two techniques, Information Calibration Quantization and Information Elastic Connection, to improve the accuracy of quantized LLMs while maintaining efficiency and versatility.
Findings
Achieves a 1.4% accuracy improvement on MMLU for 4-bit LLaMA-7B.
Significantly improves quantized LLM performance with only 0.31% additional time consumption.
Compatible with multiple quantization frameworks and models.
Abstract
The LoRA-finetuning quantization of LLMs has been extensively studied to obtain accurate yet compact LLMs for deployment on resource-constrained hardware. However, existing methods cause the quantized LLM to severely degrade and even fail to benefit from the finetuning of LoRA. This paper proposes a novel IR-QLoRA for pushing quantized LLMs with LoRA to be highly accurate through information retention. The proposed IR-QLoRA mainly relies on two technologies derived from the perspective of unified information: (1) statistics-based Information Calibration Quantization allows the quantized parameters of the LLM to retain original information accurately; (2) finetuning-based Information Elastic Connection makes LoRA utilize elastic representation transformation with diverse information. Comprehensive experiments show that IR-QLoRA can significantly improve accuracy across LLaMA and LLaMA2…
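The abstract does not spell out how the calibration step works, so the following is only an illustrative sketch of one way a statistics-based calibration could retain information. It assumes that calibration means shifting the weights by a constant tau before quantization and choosing tau so that the entropy of the quantized codes is as high as possible. The function names, the uniform 4-bit grid, and the search range around the median are all hypothetical choices made for this example, not the authors' implementation.

```python
import math
import random
import statistics
from collections import Counter


def quantize_codes(weights, tau, bits=4):
    """Shift by tau, scale by the absolute maximum, round to a uniform grid.

    Returns integer codes in [0, 2**bits - 1]. A uniform grid is an
    assumption made here for simplicity.
    """
    levels = 2 ** bits
    shifted = [w - tau for w in weights]
    scale = max(abs(s) for s in shifted) or 1.0
    return [round((s / scale + 1.0) / 2.0 * (levels - 1)) for s in shifted]


def code_entropy(codes):
    """Shannon entropy (in bits) of the empirical distribution of the codes."""
    n = len(codes)
    return -sum((c / n) * math.log2(c / n) for c in Counter(codes).values())


def calibrate_tau(weights, bits=4, num_candidates=41, span=0.5):
    """Grid-search a shift tau that maximizes the entropy of the codes.

    Candidates are spread over median +/- span * std (a hypothetical range).
    tau = 0.0 is always included, so the result is never worse than the
    uncalibrated baseline on this entropy measure.
    """
    center = statistics.median(weights)
    sigma = statistics.pstdev(weights)
    step = 2 * span * sigma / (num_candidates - 1)
    candidates = [center - span * sigma + i * step for i in range(num_candidates)]
    candidates.append(0.0)
    return max(candidates, key=lambda t: code_entropy(quantize_codes(weights, t, bits)))


# Toy weights with a non-zero mean, so a shift has something to correct.
rng = random.Random(0)
weights = [rng.gauss(0.3, 1.0) for _ in range(4096)]

tau = calibrate_tau(weights)
h_plain = code_entropy(quantize_codes(weights, 0.0))
h_cal = code_entropy(quantize_codes(weights, tau))
print(f"tau={tau:.3f}  entropy: {h_plain:.3f} -> {h_cal:.3f} bits")
```

With 4 bits the entropy is bounded by 4, and a higher value means the 16 available codes are used more evenly, which is the sense in which the quantized weights "retain" more information in this toy setting.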
Taxonomy
Topics: Advanced Vision and Imaging
