RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
Xijie Huang, Zechun Liu, Shih-Yang Liu, Kwang-Ting Cheng

TL;DR
RoLoRA introduces a rotation-based method to improve weight-activation quantization in LoRA fine-tuned large language models, significantly enhancing accuracy and robustness in low-bit quantization scenarios.
Contribution
RoLoRA is the first LoRA-based scheme to apply rotation-aware fine-tuning for outlier elimination in weight-activation quantization.
Findings
Up to 29.5% accuracy gain on 4-bit quantized LLaMA2-13B.
Consistent improvement in low-bit LoRA convergence.
Enhanced robustness of quantized models across multiple LLMs.
Abstract
Low-Rank Adaptation (LoRA), a representative Parameter-Efficient Fine-Tuning (PEFT) method, significantly improves training efficiency by updating only a small portion of the weights in Large Language Models (LLMs). Recently, weight-only quantization techniques have also been applied to LoRA methods to reduce the memory footprint of fine-tuning. However, applying weight-activation quantization to the LoRA pipeline is under-explored, and we observe substantial performance degradation primarily due to the presence of activation outliers. In this work, we propose RoLoRA, the first LoRA-based scheme for effective weight-activation quantization. RoLoRA utilizes rotation for outlier elimination and proposes rotation-aware fine-tuning to preserve the outlier-free characteristics in rotated LLMs. Experimental results show RoLoRA consistently improves low-bit LoRA convergence and…
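The idea of using rotation for outlier elimination can be illustrated with a small NumPy sketch. This is not the authors' implementation: the random orthogonal matrix, the synthetic activations with two inflated channels, the helper names (random_orthogonal, peak_to_rms), and the peak-to-RMS metric are all assumptions chosen for illustration. It shows only that rotating activations by an orthogonal matrix R while folding R's transpose into the weights leaves the layer output unchanged, and that the rotation spreads the outlier channels across all channels.

```python
import numpy as np

def random_orthogonal(n, rng):
    """Draw an n x n orthogonal matrix via QR of a Gaussian matrix (illustrative choice)."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

def peak_to_rms(x):
    """Largest absolute entry divided by the root-mean-square of all entries."""
    return np.abs(x).max() / np.sqrt(np.mean(x ** 2))

rng = np.random.default_rng(0)
tokens, d_in, d_out = 128, 64, 32

# Hypothetical activations: two channels carry values about 50x larger than the rest.
X = rng.standard_normal((tokens, d_in))
X[:, [3, 17]] *= 50.0
W = rng.standard_normal((d_in, d_out))

# Rotate the activations and fold the inverse rotation into the weights.
R = random_orthogonal(d_in, rng)
X_rot = X @ R
W_rot = R.T @ W

# Since R @ R.T = I, the layer output is preserved up to floating-point error.
Y = X @ W
Y_rot = X_rot @ W_rot

ratio_before = peak_to_rms(X)
ratio_after = peak_to_rms(X_rot)
print("max output difference:", np.abs(Y - Y_rot).max())
print("peak-to-RMS before rotation:", ratio_before)
print("peak-to-RMS after rotation:", ratio_after)
```

A lower peak-to-RMS ratio means the largest activation sits closer to the typical magnitude, so a low-bit quantization grid scaled to the maximum wastes fewer levels on rare extreme values. How RoLoRA constructs its rotation and keeps the rotated model outlier-free during fine-tuning is described in the paper itself, not in this sketch.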
Taxonomy
Topics: CCD and CMOS Imaging Sensors · Analytical Chemistry and Sensors · Advanced MRI Techniques and Applications
