Efficient Modular Learning through Naive LoRA Summation: Leveraging Orthogonality in High-Dimensional Models
Zhanhao Cao, Clement Truong, Andrew Lizarraga

TL;DR
This paper demonstrates that LoRA modules trained independently on different domains can be combined by simple addition, relying on their approximate orthogonality, to approach the performance of fine-tuning on merged data without additional training.
Contribution
It introduces a method for modularly combining LoRA adapters based on orthogonality, enabling efficient multi-domain adaptation with minimal computation.
Findings
Naive LoRA summation performs comparably to merged-data fine-tuning for some domain pairs (perplexity changes range from -9.10% to +27.56%).
Adapter orthogonality correlates with minimal interference.
Simple addition of LoRA modules enables rapid multi-domain adaptation.
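As a worked illustration of why near-orthogonal adapters would interfere little, the standard LoRA parameterization and a Frobenius-norm identity can be written out as below. This is a sketch, not the paper's derivation: the symbols W_0, B_i, A_i and the use of the Frobenius inner product as the similarity measure are assumptions introduced here.

```latex
% Each adapter i stores a low-rank update (rank r, scaling alpha):
\Delta W_i = \frac{\alpha}{r}\, B_i A_i

% Naive summation of two independently trained adapters:
W_{\mathrm{sum}} = W_0 + \Delta W_1 + \Delta W_2

% Norm of the combined update, with the cross term made explicit:
\|\Delta W_1 + \Delta W_2\|_F^2
  = \|\Delta W_1\|_F^2 + \|\Delta W_2\|_F^2
  + 2\,\langle \Delta W_1, \Delta W_2 \rangle_F

% Cosine similarity between the two updates:
\cos(\Delta W_1, \Delta W_2)
  = \frac{\langle \Delta W_1, \Delta W_2 \rangle_F}
         {\|\Delta W_1\|_F \,\|\Delta W_2\|_F}
```

When the cosine is close to zero the cross term vanishes, so the summed update behaves like two non-overlapping contributions.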
Abstract
Recent advances in large language models are driven by scale, while parameter-efficient fine-tuning (PEFT) enables updating only a small fraction of parameters. Low-Rank Adaptation (LoRA) stores parameter deltas as the product of two small matrices, which makes them natural building blocks that can be composed. Motivated by the superposition principle, we hypothesize that independently trained LoRA modules on disjoint domains are approximately orthogonal and can be combined by simple addition. Using GPT-2 Small (117M) with LoRA rank 4 and alpha=64, we train adapters for three QA domains (math, medicine, finance). In pairwise tests, adding the Math+Medicine adapters changes perplexity by -9.10% relative to merged-data fine-tuning (an improvement, since lower perplexity is better), while Math+Finance and Finance+Medicine change it by +4.54% and +27.56%, respectively. Across combinations, the RMS cosine similarity between LoRA deltas correlates…
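The idea in the abstract (form each adapter's dense delta from its two small matrices, add the deltas to the base weight, and measure how aligned the deltas are) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the random stand-in matrices, the layer width of 256, and the choice to aggregate per-layer cosines by root mean square are all hypothetical. Only rank 4 and alpha=64 come from the abstract.

```python
import numpy as np

def lora_delta(B, A, alpha, rank):
    """Dense weight update for one adapter: (alpha / rank) * B @ A."""
    return (alpha / rank) * (B @ A)

def cosine_similarity(d1, d2):
    """Cosine between two deltas, each flattened to a vector."""
    v1, v2 = d1.ravel(), d2.ravel()
    return float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))

def rms_cosine(deltas_a, deltas_b):
    """Root mean square of per-layer cosines (assumed aggregation)."""
    cos = np.array([cosine_similarity(a, b) for a, b in zip(deltas_a, deltas_b)])
    return float(np.sqrt(np.mean(cos ** 2)))

def sum_adapters(base_weight, deltas):
    """Naive summation: add every adapter delta to the base weight."""
    return base_weight + sum(deltas)

# Hypothetical setup: random matrices stand in for trained adapters.
rng = np.random.default_rng(0)
d_model, rank, alpha, n_layers = 256, 4, 64, 3

def random_adapter():
    # One (B, A) pair per layer, converted to dense deltas.
    return [
        lora_delta(rng.normal(size=(d_model, rank)),
                   rng.normal(size=(rank, d_model)),
                   alpha, rank)
        for _ in range(n_layers)
    ]

math_deltas = random_adapter()
med_deltas = random_adapter()

base = rng.normal(size=(d_model, d_model))
combined = sum_adapters(base, [math_deltas[0], med_deltas[0]])
similarity = rms_cosine(math_deltas, med_deltas)
print(f"RMS cosine between adapters: {similarity:.4f}")
```

Random low-rank matrices in a high-dimensional space are nearly orthogonal, so the printed value should be small. Whether trained adapters behave the same way is the empirical question the paper examines.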
Taxonomy
Topics: Speech Recognition and Synthesis · Domain Adaptation and Few-Shot Learning
