LoRA vs Full Fine-tuning: An Illusion of Equivalence

Reece Shuttleworth; Jacob Andreas; Antonio Torralba; Pratyusha Sharma

arXiv:2410.21228 · cs.LG · October 23, 2025 · 3 cites

TL;DR

This paper compares LoRA and full fine-tuning of large language models and finds that LoRA introduces new, high-ranking singular vectors, called intruder dimensions, which affect forgetting and model performance.

Contribution

The paper shows that LoRA and full fine-tuning produce weight matrices with fundamentally different spectral structure: LoRA-trained matrices contain intruder dimensions, full fine-tuned matrices do not, and these dimensions affect forgetting.

Findings

01

LoRA introduces high-ranking intruder dimensions (new singular vectors) that are not present after full fine-tuning.

02

Intruder dimensions are causally linked to model forgetting.

03

Scaling down intruder dimensions improves pre-training distribution modeling with minimal performance loss.
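
The idea of scaling down a single dimension can be sketched with a singular value decomposition. The snippet below is a minimal illustration, not necessarily the paper's exact procedure: it assumes a dimension is identified by its position in the SVD of a fine-tuned weight matrix, and the function name `scale_singular_component` and the factor `lam` are hypothetical names chosen for this example.

```python
import numpy as np

def scale_singular_component(w, index, lam):
    """Return a copy of w whose singular component at `index` is rescaled by lam.

    lam = 1.0 leaves w unchanged; lam = 0.0 removes the component entirely.
    (Illustrative helper; the name and signature are assumptions.)
    """
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    s_scaled = s.copy()
    s_scaled[index] *= lam
    # (u * s_scaled) multiplies column j of u by s_scaled[j],
    # so the product rebuilds the matrix from the rescaled spectrum.
    return (u * s_scaled) @ vt

# Toy usage on a random matrix standing in for a fine-tuned weight matrix.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 6))
w_damped = scale_singular_component(w, index=0, lam=0.5)
```

Only the chosen component changes; every other singular vector and singular value of the matrix is preserved.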

Abstract

Fine-tuning is a crucial paradigm for adapting pre-trained large language models to downstream tasks. Recently, methods like Low-Rank Adaptation (LoRA) have been shown to effectively fine-tune LLMs with an extreme reduction in trainable parameters. But are their learned solutions really equivalent? We study how LoRA and full fine-tuning change pre-trained models by analyzing the model's weight matrices through the lens of their spectral properties. We find that LoRA and full fine-tuning yield weight matrices whose singular value decompositions exhibit very different structure: weight matrices trained with LoRA have new, high-ranking singular vectors, which we call intruder dimensions, while those trained with full fine-tuning do not. Further, we extend the finding that LoRA forgets less than full fine-tuning and find its forgetting is vastly localized to the intruder…
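
The spectral comparison described in the abstract can be illustrated with a small NumPy sketch. It assumes a working definition in which a top singular vector of the fine-tuned matrix counts as an intruder dimension when its cosine similarity with every singular vector of the pre-trained matrix falls below a threshold. The function name `intruder_mask`, the threshold `epsilon`, the cutoff `k`, and the toy matrices are all assumptions made for this example, not values taken from the paper.

```python
import numpy as np

def intruder_mask(w_pre, w_tuned, k=10, epsilon=0.5):
    """Flag which of the top-k left singular vectors of w_tuned look new.

    A vector is flagged when its largest absolute cosine similarity with
    any left singular vector of w_pre is below epsilon.
    """
    u_pre, _, _ = np.linalg.svd(w_pre, full_matrices=False)
    u_tuned, _, _ = np.linalg.svd(w_tuned, full_matrices=False)
    # Singular vectors have unit norm, so dot products are cosine similarities.
    sims = np.abs(u_pre.T @ u_tuned[:, :k])
    return sims.max(axis=0) < epsilon

# Toy example: a diagonal "pre-trained" matrix whose singular vectors are the
# standard basis, plus a large rank-1 update along a direction spread evenly
# over all coordinates (cosine 1/8 with each basis vector when n = 64).
n = 64
w_pre = np.diag(np.arange(n, 0, -1, dtype=float))
direction = np.ones(n) / np.sqrt(n)
w_tuned = w_pre + 1000.0 * np.outer(direction, direction)

print(intruder_mask(w_pre, w_tuned, k=1))   # top vector of the updated matrix
print(intruder_mask(w_pre, w_pre, k=10))    # unchanged matrix as a control
```

In this toy setup the top singular vector of the updated matrix is flagged, while comparing the pre-trained matrix with itself flags nothing.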

Taxonomy

Topics: Embedded Systems Design Techniques · Fault Detection and Control Systems · Parallel Computing and Optimization Techniques