OP-LoRA: The Blessing of Dimensionality
Piotr Teterwak, Kate Saenko, Bryan A. Plummer, Ser-Nam Lim

TL;DR
OP-LoRA introduces an over-parameterized method for low-rank adapters that accelerates training and improves performance across various tasks without increasing inference costs.
Contribution
The paper proposes an over-parameterized reparameterization of low-rank adapters: for each layer, a separate MLP takes a learned embedding as input and generates the adapter parameters, which improves optimization and convergence.
Findings
Faster convergence on a matrix factorization task
Lower final loss in experiments
Significant performance improvements in vision-language and image generation tasks
Abstract
Low-rank adapters enable fine-tuning of large models with only a small number of parameters, thus reducing storage costs and minimizing the risk of catastrophic forgetting. However, they often pose optimization challenges, with poor convergence. To overcome these challenges, we introduce an over-parameterized approach that accelerates training without increasing inference costs. This method reparameterizes low-rank adaptation by employing a separate MLP and learned embedding for each layer. The learned embedding is input to the MLP, which generates the adapter parameters. Such overparameterization has been shown to implicitly function as an adaptive learning rate and momentum, accelerating optimization. At inference time, the MLP can be discarded, leaving behind a standard low-rank adapter. To study the effect of MLP overparameterization on a small yet difficult proxy task, we implement…
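The reparameterization described in the abstract can be illustrated with a minimal NumPy sketch of a single adapted layer. This is not the authors' implementation: the two-layer ReLU MLP, the layer and embedding sizes, and the names `generate_adapter`, `forward`, `A_export`, and `B_export` are all assumptions made for illustration, and no training loop is shown. The sketch only demonstrates the structural point that the MLP maps a learned embedding to the low-rank factors, and that once those factors are exported the MLP is no longer needed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
d_out, d_in, rank = 8, 6, 2      # adapted layer shape and adapter rank
emb_dim, hidden = 4, 32          # embedding size and MLP hidden width

# Frozen pretrained weight of one layer.
W0 = rng.normal(size=(d_out, d_in))

# Training-time parameters (assumed form): a learned embedding z and a
# two-layer MLP whose output holds the flattened adapter factors.
z = rng.normal(size=emb_dim)
W1 = rng.normal(size=(hidden, emb_dim)) * 0.1
b1 = np.zeros(hidden)
W2 = rng.normal(size=(rank * (d_in + d_out), hidden)) * 0.1
b2 = np.zeros(rank * (d_in + d_out))


def generate_adapter(embedding):
    """Map the embedding to low-rank factors A (rank x d_in) and B (d_out x rank)."""
    h = np.maximum(W1 @ embedding + b1, 0.0)   # ReLU hidden layer
    flat = W2 @ h + b2
    A = flat[: rank * d_in].reshape(rank, d_in)
    B = flat[rank * d_in:].reshape(d_out, rank)
    return A, B


def forward(x, A, B):
    """Frozen layer output plus the low-rank update B @ A applied to x."""
    return W0 @ x + B @ (A @ x)


x = rng.normal(size=d_in)

# Training-time path: adapter factors are regenerated from the MLP.
y_train = forward(x, *generate_adapter(z))

# Inference-time path: export the factors once, then drop the MLP and embedding.
A_export, B_export = (m.copy() for m in generate_adapter(z))
y_infer = forward(x, A_export, B_export)
```

In this sketch the exported pair `A_export`, `B_export` has the same shape and cost as a standard low-rank adapter, so the extra MLP parameters affect only training.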
Taxonomy
Methods: Adapter
