CLoQ: Enhancing Fine-Tuning of Quantized LLMs via Calibrated LoRA Initialization
Yanxia Deng, Aozhong Zhang, Selcuk Gurses, Naigang Wang, Zi Yang, Penghang Yin

TL;DR
This paper introduces CLoQ, a calibration-based initialization method for fine-tuning quantized large language models with LoRA. It improves performance, especially at ultra-low-bit quantization levels.
Contribution
The paper proposes a novel calibration-based initialization strategy for LoRA in quantized LLMs, including a theoretical framework for optimal LoRA component construction.
Findings
CLoQ outperforms existing LoRA methods on multiple tasks.
It remains effective at ultra-low-bit quantization levels.
It provides a strong foundation for fine-tuning using only a small calibration dataset.
Abstract
Fine-tuning large language models (LLMs) using low-rank adaptation (LoRA) has become a highly efficient approach for downstream tasks, particularly in scenarios with limited computational resources. However, applying LoRA techniques to quantized LLMs poses unique challenges due to the reduced representational precision of quantized weights. In this paper, we introduce CLoQ (Calibrated LoRA initialization for Quantized LLMs), a simplistic initialization strategy designed to overcome these challenges. Our approach focuses on minimizing the layer-wise discrepancy between the original LLM and its quantized counterpart with LoRA components during initialization. By leveraging a small calibration dataset, CLoQ quantizes a pre-trained LLM and determines the optimal LoRA components for each layer, ensuring a strong foundation for subsequent fine-tuning. A key contribution of this work is a…
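The layer-wise objective described in the abstract can be illustrated with a small NumPy sketch. It assumes the discrepancy is measured as the Frobenius norm ‖X(Q + ABᵀ − W)‖, where X holds calibration activations, W is the original weight, Q is its quantized version, and A, B are the LoRA factors. The abstract is truncated before it states the paper's exact construction, so this is not claimed to be the authors' procedure. The function names, the round-to-nearest quantizer, and the damping term are all hypothetical choices made for illustration.

```python
import numpy as np

def quantize_rtn(W, bits=2):
    """Per-column uniform round-to-nearest quantizer (a stand-in, not the paper's)."""
    levels = 2 ** bits - 1
    lo = W.min(axis=0, keepdims=True)
    hi = W.max(axis=0, keepdims=True)
    scale = (hi - lo) / levels
    scale = np.where(scale == 0, 1.0, scale)  # guard against constant columns
    return np.round((W - lo) / scale) * scale + lo

def calibrated_lora_init(X, W, Q, rank, damp=1e-6):
    """Rank-r factors A (m x r), B (n x r) that minimize ||X (Q + A B^T - W)||_F.

    Assumed objective; solved by whitening with the Gram matrix of X.
    """
    H = X.T @ X
    # Small damping (hypothetical) keeps the matrix square root invertible.
    H += damp * np.trace(H) / H.shape[0] * np.eye(H.shape[0])
    evals, evecs = np.linalg.eigh(H)
    R = (evecs * np.sqrt(evals)) @ evecs.T      # symmetric square root of H
    R_inv = (evecs / np.sqrt(evals)) @ evecs.T  # its inverse
    # Best rank-r approximation of the whitened residual, then un-whiten.
    U, S, Vt = np.linalg.svd(R @ (W - Q), full_matrices=False)
    A = R_inv @ (U[:, :rank] * S[:rank])
    B = Vt[:rank].T
    return A, B

def plain_svd_init(W, Q, rank):
    """Data-free baseline: truncated SVD of the raw quantization residual."""
    U, S, Vt = np.linalg.svd(W - Q, full_matrices=False)
    return U[:, :rank] * S[:rank], Vt[:rank].T

def layer_error(X, W, Q, A, B):
    """Calibration-weighted discrepancy between original and adapted layer."""
    return np.linalg.norm(X @ (Q + A @ B.T - W))

# Toy demo with synthetic, correlated calibration activations.
rng = np.random.default_rng(0)
X = rng.standard_normal((256, 32)) @ rng.standard_normal((32, 32))
W = rng.standard_normal((32, 16))
Q = quantize_rtn(W, bits=2)

A_cal, B_cal = calibrated_lora_init(X, W, Q, rank=4)
A_svd, B_svd = plain_svd_init(W, Q, rank=4)
err_cal = layer_error(X, W, Q, A_cal, B_cal)
err_svd = layer_error(X, W, Q, A_svd, B_svd)
print("calibrated init error:", err_cal)
print("plain SVD init error: ", err_svd)
```

Under this assumed objective, the calibrated factors should give a layer error no larger than the data-free SVD baseline, because the whitening step weights the residual by the directions the calibration activations actually exercise.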
Taxonomy
Topics: Medical Imaging Techniques and Applications
