Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients
Zhenyu Zhang, Ajay Jaiswal, Lu Yin, Shiwei Liu, Jiawei Zhao, Yuandong Tian, Zhangyang Wang

TL;DR
Q-GaLore introduces a memory-efficient training method for large language models by combining quantization and adaptive low-rank gradient projection, significantly reducing memory usage while maintaining high performance.
Contribution
It proposes an adaptive, quantized low-rank gradient projection technique that reduces SVD computations and memory footprint in LLM training and fine-tuning.
Findings
Enables training LLaMA-7B from scratch on a single GPU with 16 GB of memory.
In fine-tuning, reduces memory consumption by up to 50% compared to LoRA and GaLore.
Outperforms QLoRA at the same memory cost.
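To give a rough sense of why a 16 GB budget is tight, the following back-of-envelope sketch counts optimizer-state elements for a single weight matrix with and without low-rank projection, and the raw weight storage for a nominal 7-billion-parameter model. The matrix shape, the rank, the round parameter count, and the 8-bit comparison point are illustrative assumptions, not figures taken from the paper.

```python
def adam_state_elements(m, n):
    """Adam keeps two moment tensors (first and second) per m x n weight."""
    return 2 * m * n


def projected_state_elements(m, n, rank):
    """Two moments in the rank x n projected space, plus the m x rank
    projection matrix that maps gradients into that space."""
    return 2 * rank * n + m * rank


def gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 2**30


# Hypothetical layer shape and rank, chosen only for illustration.
m, n, rank = 4096, 4096, 1024
full = adam_state_elements(m, n)
low = projected_state_elements(m, n, rank)
ratio = low / full

# Raw weight storage for a nominal 7e9-parameter model (assumed round number).
params = 7_000_000_000
weights_16bit_gib = gib(params * 2)  # 2 bytes per parameter
weights_8bit_gib = gib(params * 1)   # 1 byte per parameter

print(f"state elements: full={full}, projected={low}, ratio={ratio:.3f}")
print(f"weights: 16-bit={weights_16bit_gib:.2f} GiB, 8-bit={weights_8bit_gib:.2f} GiB")
```

Under these assumptions, 16-bit weights alone take most of a 16 GB card before any gradients, activations, or optimizer states are counted, which is why shrinking both the weights and the optimizer states matters.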
Abstract
Training Large Language Models (LLMs) is memory-intensive due to the large number of parameters and associated optimization states. GaLore, a recent method, reduces memory usage by projecting weight gradients into a low-rank subspace without compromising performance. However, GaLore relies on time-consuming Singular Value Decomposition (SVD) operations to identify the subspace, and the frequent subspace updates lead to significant training time overhead. Moreover, GaLore offers minimal improvements in accuracy and efficiency compared to LoRA in more accessible fine-tuning scenarios. To address these limitations, we introduce Q-GaLore, a novel approach that substantially reduces memory usage by combining quantization and low-rank projection, surpassing the benefits of GaLore. Our method is based on two key observations: (i) the gradient subspace exhibits diverse properties, with some…
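The two ingredients named in the abstract, SVD-based low-rank gradient projection and quantization of the projection matrix, can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the gradient is synthetic, and the symmetric per-tensor INT4 scheme is a simple stand-in because the text above does not specify the quantizer.

```python
import numpy as np


def compute_projection(grad, rank):
    """Return the top-`rank` left singular vectors of grad (shape m x rank)."""
    u, _, _ = np.linalg.svd(grad, full_matrices=False)
    return u[:, :rank]


def project(grad, p):
    """Map an m x n gradient into the rank x n subspace."""
    return p.T @ grad


def project_back(low_rank, p):
    """Map a rank x n update back to the full m x n shape."""
    return p @ low_rank


def quantize_int4(x):
    """Symmetric per-tensor quantization to integers in [-8, 7].
    Assumed scheme for illustration; values are held in an int8 array."""
    scale = float(np.max(np.abs(x))) / 7.0
    if scale == 0.0:
        scale = 1.0
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return q, scale


def dequantize(q, scale):
    """Recover an approximate float matrix from the quantized integers."""
    return q.astype(np.float32) * scale


rng = np.random.default_rng(0)
m, n, rank = 64, 32, 4  # hypothetical sizes
# Synthetic gradient with exact rank 4, so the projection loses nothing.
grad = rng.standard_normal((m, rank)) @ rng.standard_normal((rank, n))

p = compute_projection(grad, rank)
low = project(grad, p)
recon = project_back(low, p)
rel_err = np.linalg.norm(grad - recon) / np.linalg.norm(grad)

# Quantize the projection matrix and measure the extra error it introduces.
q, scale = quantize_int4(p)
p_q = dequantize(q, scale)
recon_q = project_back(project(grad, p_q), p_q)
rel_err_q = np.linalg.norm(grad - recon_q) / np.linalg.norm(grad)

print(f"full-precision projection error: {rel_err:.2e}")
print(f"INT4-projection error:           {rel_err_q:.2e}")
```

The optimizer would operate on `low` rather than on the full gradient, which is where the memory saving comes from. The SVD in `compute_projection` is the costly step the abstract refers to, so recomputing it less often directly reduces training overhead.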
Taxonomy
Topics: Ga2O3 and related materials · GaN-based semiconductor devices and materials · Semiconductor materials and devices
