OP-LoRA: The Blessing of Dimensionality

Piotr Teterwak; Kate Saenko; Bryan A. Plummer; Ser-Nam Lim

arXiv:2412.10362·cs.LG·December 16, 2024

TL;DR

OP-LoRA introduces an over-parameterized method for low-rank adapters that accelerates training and improves performance across various tasks without increasing inference costs.

Contribution

The paper proposes a novel over-parameterized reparameterization of low-rank adapters using a separate MLP and learned embeddings, enhancing optimization and convergence.

Findings

01 Faster convergence on matrix factorization task
02 Lower final loss in experiments
03 Significant performance improvements in vision-language and image generation tasks

Abstract

Low-rank adapters enable fine-tuning of large models with only a small number of parameters, thus reducing storage costs and minimizing the risk of catastrophic forgetting. However, they often pose optimization challenges, with poor convergence. To overcome these challenges, we introduce an over-parameterized approach that accelerates training without increasing inference costs. This method reparameterizes low-rank adaptation by employing a separate MLP and learned embedding for each layer. The learned embedding is input to the MLP, which generates the adapter parameters. Such overparameterization has been shown to implicitly function as an adaptive learning rate and momentum, accelerating optimization. At inference time, the MLP can be discarded, leaving behind a standard low-rank adapter. To study the effect of MLP overparameterization on a small yet difficult proxy task, we implement…
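The reparameterization described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the class name `OverParamAdapter`, the single-hidden-layer ReLU MLP, the layer sizes, the random initialization, and the choice to emit both factors from one MLP output are all assumptions made for illustration. The sketch covers only the forward pass (embedding, then MLP, then low-rank factors A and B) and the export step in which the MLP is dropped; training is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
d_in, d_out, rank = 16, 12, 4     # shape of the adapted layer and the adapter rank
embed_dim, hidden_dim = 8, 32     # per-layer embedding size and MLP width


class OverParamAdapter:
    """Per-layer learned embedding plus MLP that emits the low-rank factors.

    Assumed architecture: one hidden ReLU layer, one flat output vector
    that is split and reshaped into A (rank x d_in) and B (d_out x rank).
    """

    def __init__(self):
        n_out = rank * d_in + d_out * rank                  # entries of A plus entries of B
        self.z = rng.normal(size=embed_dim)                 # learned embedding (MLP input)
        self.W1 = rng.normal(size=(hidden_dim, embed_dim)) * 0.1
        self.b1 = np.zeros(hidden_dim)
        self.W2 = rng.normal(size=(n_out, hidden_dim)) * 0.1
        self.b2 = np.zeros(n_out)

    def num_params(self):
        return sum(p.size for p in (self.z, self.W1, self.b1, self.W2, self.b2))

    def generate(self):
        h = np.maximum(self.W1 @ self.z + self.b1, 0.0)     # ReLU hidden layer
        flat = self.W2 @ h + self.b2                        # all adapter entries, flattened
        A = flat[: rank * d_in].reshape(rank, d_in)
        B = flat[rank * d_in:].reshape(d_out, rank)
        return A, B


def adapted_forward(W0, A, B, x):
    # Frozen base weight plus the low-rank update B @ A.
    return W0 @ x + B @ (A @ x)


W0 = rng.normal(size=(d_out, d_in))   # stand-in for a frozen pretrained weight
x = rng.normal(size=d_in)

gen = OverParamAdapter()
A, B = gen.generate()
y_train_time = adapted_forward(W0, A, B, x)
n_generator = gen.num_params()

# Export: keep plain copies of A and B, then discard the generator.
A_export, B_export = A.copy(), B.copy()
del gen
y_inference = adapted_forward(W0, A_export, B_export, x)
n_adapter = A_export.size + B_export.size
```

In this sketch the generator holds many more parameters than the adapter it emits (`n_generator` versus `n_adapter`), yet the exported layer needs only `A_export` and `B_export`, so the inference path is the same as for a standard low-rank adapter.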

Taxonomy

Methods: Adapter