Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning
Arijit Sehanobish, Avinava Dubey, Krzysztof Choromanski, Somnath Basu Roy Chowdhury, Deepali Jain, Vikas Sindhwani, Snigdha Chaturvedi

TL;DR
This paper introduces Structured Unrestricted-Rank Matrices (SURM), a parameter-efficient fine-tuning (PEFT) method for Transformers that improves accuracy while using fewer trainable parameters than existing approaches such as LoRA and Adapters.
Contribution
The paper proposes SURM, a PEFT framework built on low displacement rank matrices (LDRMs) that can serve as a drop-in replacement for Adapters and LoRA and offers more flexibility in balancing compactness against expressiveness.
Findings
SURM achieves 5-7% accuracy gains on image classification tasks when used in place of the low-rank matrices in LoRA.
SURM reduces the number of adapter parameters by up to 12x with minimal loss in quality.
SURM outperforms or matches baselines across multiple benchmarks.
Abstract
Recent efforts to scale Transformer models have demonstrated rapid progress across a wide range of tasks (Wei et al., 2022). However, fine-tuning these models for downstream tasks is expensive due to their large parameter counts. Parameter-efficient fine-tuning (PEFT) approaches have emerged as a viable alternative, allowing us to fine-tune models by updating only a small number of parameters. In this work, we propose a general framework for PEFT based on structured unrestricted-rank matrices (SURMs), which can serve as a drop-in replacement for popular approaches such as Adapters and LoRA. Unlike methods such as LoRA, SURMs provide more flexibility in finding the right balance between compactness and expressiveness. This is achieved by using low displacement rank matrices (LDRMs), which have not been used in this context before. SURMs remain…
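To make the contrast between a low-rank update and a structured unrestricted-rank update concrete, here is a minimal NumPy sketch. It assumes a circulant matrix as the structured update; circulant matrices are a standard example of a low displacement rank matrix, but the choice here is illustrative and is not taken from the abstract. The names `circulant`, `circulant_matvec`, `delta_circ`, and `delta_lora` are hypothetical, and the sizes `n` and `r` are arbitrary toy values.

```python
import numpy as np

def circulant(c):
    """Dense n x n circulant matrix whose first column is c (illustrative helper)."""
    n = c.shape[0]
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return c[idx]

def circulant_matvec(c, x):
    """Compute circulant(c) @ x via the FFT (circular convolution of c and x)."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

rng = np.random.default_rng(0)
n, r = 8, 2  # toy layer width and LoRA rank (assumed values)

# Structured update: n trainable parameters, rank not restricted by construction.
c = rng.standard_normal(n)
delta_circ = circulant(c)

# LoRA-style update: 2 * n * r trainable parameters, rank at most r.
A = rng.standard_normal((n, r))
B = rng.standard_normal((r, n))
delta_lora = A @ B

circ_params = c.size
lora_params = A.size + B.size
circ_rank = np.linalg.matrix_rank(delta_circ)
lora_rank = np.linalg.matrix_rank(delta_lora)

# The FFT route agrees with the dense product.
x = rng.standard_normal(n)
fft_matches_dense = np.allclose(delta_circ @ x, circulant_matvec(c, x))

# A circulant matrix commutes with the cyclic shift Z, so its displacement
# Z @ M - M @ Z vanishes: the extreme case of "low displacement rank".
Z = circulant(np.eye(n)[1])
displacement_is_zero = np.allclose(Z @ delta_circ, delta_circ @ Z)

print(circ_params, lora_params, circ_rank, lora_rank)
```

In this toy setting the circulant update uses fewer parameters than the rank-2 LoRA update while its rank is not capped at `r`, which is the kind of compactness versus expressiveness trade-off the abstract describes. The matrix classes and parametrizations actually used by SURM may differ from this sketch.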
Taxonomy
Topics: Matrix Theory and Algorithms · Neural Networks and Applications · Electromagnetic Scattering and Analysis
Methods: Attention Is All You Need · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Position-Wise Feed-Forward Layer · Dropout · Adam · Linear Layer · Absolute Position Encodings
