ScaLoRA: Optimally Scaled Low-Rank Adaptation for Efficient High-Rank Fine-Tuning
Yilang Zhang, Xiaodong Yang, Yiwei Cai, Georgios B. Giannakis

TL;DR
ScaLoRA improves low-rank adaptation of large language models by optimally scaling each low-rank increment, so that consecutive increments accumulate into a high-rank weight update and fine-tuning becomes more efficient and more effective.
Contribution
The paper proposes a novel approach that accumulates high-rank weight updates through optimal scaling of low-rank matrices, overcoming limitations of traditional LoRA.
Findings
Consistent performance improvements over state-of-the-art LoRA variants.
Faster convergence in fine-tuning large language models.
Effective across diverse NLP tasks including reasoning and problem solving.
Abstract
As large language models (LLMs) continue to scale in size, the computational overhead has become a major bottleneck for task-specific fine-tuning. While low-rank adaptation (LoRA) effectively curtails this cost by confining the weight updates to a low-dimensional subspace, such a restriction can hinder effectiveness and slow convergence. This contribution addresses these limitations by progressively accumulating a high-rank weight update from consecutive low-rank increments. Specifically, the per-update optimal low-rank matrix is identified to minimize the loss function and closely approximate full fine-tuning. To enable efficient and seamless optimization without restarting, this optimal choice is formed by appropriately scaling the columns of the original low-rank matrix. Rigorous performance guarantees reveal that the optimal scaling can be found analytically. Extensive numerical…
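The accumulation idea in the abstract can be illustrated with a small NumPy sketch. It is not the paper's algorithm: the factors are random rather than learned, and the per-column scales `s` are arbitrary placeholders rather than the analytically optimal scaling the abstract refers to. All names (`low_rank_increment`, `delta_W`, `ranks`) and the dimensions are hypothetical. The sketch only shows that summing rank-`r` increments, each with column-scaled factors, can produce a weight update whose rank exceeds `r`.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r, steps = 32, 24, 2, 5  # hypothetical layer size, LoRA rank, number of increments


def low_rank_increment(rng, m, n, r):
    """Return a rank-r matrix (A * s) @ B.T, where s rescales the columns of A."""
    A = rng.standard_normal((m, r))
    B = rng.standard_normal((n, r))
    # Placeholder scales: NOT the optimal scaling derived in the paper.
    s = rng.uniform(0.5, 1.5, size=r)
    return (A * s) @ B.T  # broadcasting multiplies column j of A by s[j]


delta_W = np.zeros((m, n))  # accumulated weight update
ranks = []
for _ in range(steps):
    delta_W += low_rank_increment(rng, m, n, r)
    ranks.append(int(np.linalg.matrix_rank(delta_W)))

print(ranks)  # rank of the accumulated update after each increment
```

Each individual increment has rank at most `r`, yet the accumulated `delta_W` can reach rank up to `steps * r`, which is the sense in which low-rank increments build up a high-rank update.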
