Rapid Switching and Multi-Adapter Fusion via Sparse High Rank Adapters
Kartikeya Bhardwaj, Nilesh Prasad Pandey, Sweta Priyadarshi, Viswanath Ganapathy, Rafael Esteves, Shreya Kadambi, Shubhankar Borse, Paul Whatmough, Risheek Garrepalli, Mart Van Baalen, Harris Teague, Markus Nagel

TL;DR
This paper introduces Sparse High Rank Adapters (SHiRA), a method that fine-tunes only a small fraction of the base model's weights, enabling fast adapter switching, improved multi-adapter fusion, and better performance than existing low-rank methods.
Contribution
SHiRA is a novel sparse adapter approach that fine-tunes only 1-2% of model weights, offering rapid switching, no inference overhead, and enhanced multi-adapter fusion capabilities.
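As a rough illustration of what "fine-tuning only 1-2% of model weights" can look like, the sketch below masks the gradient so that only a fixed sparse subset of entries ever changes. The random mask, the plain SGD update, and the helper names `make_random_mask` and `masked_sgd_step` are assumptions made for this example; they are not taken from the paper, which may select and train the sparse weights differently.

```python
import numpy as np

def make_random_mask(shape, fraction, rng):
    """Boolean mask marking roughly `fraction` of the entries as trainable (assumed random choice)."""
    n_total = int(np.prod(shape))
    n_train = max(1, int(round(fraction * n_total)))
    flat = np.zeros(n_total, dtype=bool)
    flat[rng.choice(n_total, size=n_train, replace=False)] = True
    return flat.reshape(shape)

def masked_sgd_step(weight, grad, mask, lr):
    """One SGD step in which the gradient is zeroed outside the mask."""
    return weight - lr * grad * mask

rng = np.random.default_rng(0)
base = rng.standard_normal((64, 64))        # stand-in for one base-model weight matrix
mask = make_random_mask(base.shape, 0.02, rng)
grad = rng.standard_normal(base.shape)      # stand-in for a real gradient
tuned = masked_sgd_step(base, grad, mask, lr=0.1)

# Entries outside the mask are untouched, so the update is highly sparse.
print(int(mask.sum()), "of", base.size, "weights are trainable")
```

Because the unmasked entries never move, the difference between `tuned` and `base` can be stored as a short list of indices and values rather than a second dense matrix.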
Findings
SHiRA significantly outperforms LoRA in experiments on both LVMs and LLMs.
Fine-tuning 1-2% of parameters suffices for many adapter tasks.
SHiRA can be combined with existing methods like DoRA.
Abstract
In this paper, we propose Sparse High Rank Adapters (SHiRA) that directly finetune 1-2% of the base model weights while leaving the others unchanged, resulting in a highly sparse adapter. This high sparsity incurs no inference overhead, enables rapid switching directly in the fused mode, and significantly reduces concept-loss during multi-adapter fusion. Our extensive experiments on large vision models (LVMs) and large language models (LLMs) demonstrate that finetuning merely 1-2% of the parameters in the base model is sufficient for many adapter tasks and significantly outperforms Low Rank Adaptation (LoRA). We also show that SHiRA is orthogonal to advanced LoRA methods such as DoRA and can be easily combined with existing techniques.
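To make "rapid switching directly in the fused mode" more concrete, here is a minimal sketch in which an adapter is stored as flat indices plus replacement values, and switching amounts to overwriting those few entries of the weight tensor in place. The storage layout and the helper names `make_adapter`, `load_adapter`, and `unload_adapter` are hypothetical; the adapter values are random placeholders rather than trained weights.

```python
import numpy as np

def make_adapter(n_weights, fraction, rng):
    """Hypothetical adapter: sorted flat indices plus placeholder fine-tuned values."""
    k = int(round(fraction * n_weights))
    idx = np.sort(rng.choice(n_weights, size=k, replace=False))
    return idx, rng.standard_normal(k)

def load_adapter(weight, adapter):
    """Overwrite the adapter's entries in place; return the base values for later restore."""
    idx, values = adapter
    flat = weight.reshape(-1)   # a view onto the contiguous array, not a copy
    saved = flat[idx].copy()
    flat[idx] = values
    return saved

def unload_adapter(weight, adapter, saved):
    """Restore the base values that load_adapter overwrote."""
    idx, _ = adapter
    weight.reshape(-1)[idx] = saved

rng = np.random.default_rng(0)
base = rng.standard_normal((128, 128))
weight = base.copy()
adapter_a = make_adapter(weight.size, 0.01, rng)
adapter_b = make_adapter(weight.size, 0.01, rng)

saved = load_adapter(weight, adapter_a)          # touches about 1% of the entries
n_changed = int((weight != base).sum())
unload_adapter(weight, adapter_a, saved)
restored = np.array_equal(weight, base)          # base weights are recovered exactly

# Two sparse adapters share few indices, which is one intuition for reduced interference.
overlap = np.intersect1d(adapter_a[0], adapter_b[0]).size
```

In this sketch, the cost of a switch scales with the number of adapter entries rather than with the size of the full weight matrix, and the weights used at inference are always a single dense tensor with no extra branch.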
Taxonomy
Methods: Adapter · Balanced Selection
