The Primacy of Magnitude in Low-Rank Adaptation
Zicheng Zhang, Haoran Li, Yifeng Zhang, Guoqiang Gong, Jiaxing Wang, Junxing Hu, Pengzhang Liu, Qixia Jiang

TL;DR
This paper reveals that the magnitude of weight updates is key to LoRA's success, introduces LoRAM as an efficient initialization method based on this insight, and demonstrates its effectiveness across various benchmarks.
Contribution
The paper establishes update magnitude as the core factor in LoRA performance and proposes LoRAM, a new initialization scheme that matches the performance of spectral methods without their computational and storage overhead.
Findings
LoRAM matches or exceeds spectral initialization performance.
Update magnitude governs convergence; learning rate, scaling factor, and initialization all act as ways of regulating it.
Spectral benefits are primarily due to magnitude amplification.
Abstract
Low-Rank Adaptation (LoRA) offers a parameter-efficient paradigm for tuning large models. While recent spectral initialization methods improve convergence and performance over the naive "Noise & Zeros" scheme, their extra computational and storage overhead undermines efficiency. In this paper, we establish update magnitude as the fundamental driver of LoRA performance and propose LoRAM, a magnitude-driven "Basis & Basis" initialization scheme that matches spectral methods without their inefficiencies. Our key contributions are threefold: (i) Magnitude of weight updates determines convergence. We prove low-rank structures intrinsically bound update magnitudes, unifying hyperparameter tuning in learning rate, scaling factor, and initialization as mechanisms to optimize magnitude regulation. (ii) Spectral initialization succeeds via magnitude amplification. We demystify that the presumed…
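To make the role of update magnitude concrete, the following is a minimal sketch of standard LoRA with the "Noise & Zeros" initialization (A drawn from Gaussian noise, B set to zero), not the paper's LoRAM scheme. It assumes plain gradient descent, a random stand-in gradient G, and the common alpha / rank scaling convention; the function name and default sizes are hypothetical. It measures the Frobenius norm of the effective weight update after one step, so the effect of learning rate, scaling factor, and initialization scale on that norm can be compared directly.

```python
import numpy as np

def first_step_update_norm(alpha, lr, init_std, rank=4, d_out=32, d_in=32, seed=0):
    """Norm of the LoRA weight update (alpha / rank) * B @ A after one SGD step.

    Illustrative assumptions: 'Noise & Zeros' init, plain gradient descent,
    and a random matrix G standing in for dL/dW of the adapted layer.
    """
    rng = np.random.default_rng(seed)
    A = rng.normal(0.0, init_std, size=(rank, d_in))  # "Noise"
    B = np.zeros((d_out, rank))                       # "Zeros"
    G = rng.normal(size=(d_out, d_in))                # toy gradient w.r.t. the full weight

    scale = alpha / rank
    # Effective weight is W + scale * B @ A, so dL/dB = scale * G @ A.T.
    # dL/dA = scale * B.T @ G is zero on the first step because B is zero.
    grad_B = scale * G @ A.T
    B = B - lr * grad_B

    return float(np.linalg.norm(scale * B @ A))

base = first_step_update_norm(alpha=8.0, lr=1e-2, init_std=0.1)
print("doubling lr:      ", first_step_update_norm(8.0, 2e-2, 0.1) / base)
print("doubling alpha:   ", first_step_update_norm(16.0, 1e-2, 0.1) / base)
print("doubling init_std:", first_step_update_norm(8.0, 1e-2, 0.2) / base)
```

In this toy setting the first-step update equals -lr * scale^2 * G A^T A, so its norm grows linearly in the learning rate and quadratically in both the scaling factor and the initialization scale. All three knobs therefore act on the same quantity, which is consistent with the abstract's framing; adaptive optimizers such as Adam would change these exact ratios.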
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis · Stochastic Gradient Optimization Techniques
