Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning

Arijit Sehanobish; Avinava Dubey; Krzysztof Choromanski; Somnath Basu Roy Chowdhury; Deepali Jain; Vikas Sindhwani; Snigdha Chaturvedi

arXiv:2406.17740 · cs.LG · December 19, 2024 · 1 citation

TL;DR

This paper introduces Structured Unrestricted-Rank Matrices (SURM), a flexible and efficient parameter-efficient fine-tuning method for Transformers that improves accuracy and reduces parameters compared to existing approaches like LoRA and Adapters.

Contribution

The paper proposes SURM, a novel framework using low displacement rank matrices for PEFT, offering better flexibility and efficiency than existing methods.

Findings

1. SURM achieves 5-7% accuracy improvements on image classification tasks.

2. Up to 12x reduction in parameters in adapters with minimal quality loss.

3. SURM outperforms or matches baselines across multiple benchmarks.

Abstract

Recent efforts to scale Transformer models have demonstrated rapid progress across a wide range of tasks (Wei et al., 2022). However, fine-tuning these models for downstream tasks is expensive due to their large parameter counts. Parameter-efficient fine-tuning (PEFT) approaches have emerged as a viable alternative, allowing models to be fine-tuned by updating only a small number of parameters. In this work, we propose a general framework for PEFT based on structured unrestricted-rank matrices (SURMs), which can serve as a drop-in replacement for popular approaches such as Adapters and LoRA. Unlike methods such as LoRA, SURMs provide more flexibility in finding the right balance between compactness and expressiveness. This is achieved by using low displacement rank matrices (LDRMs), which have not been used in this context before. SURMs remain…
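
As a rough illustration of the idea in the abstract, the sketch below uses a circulant matrix, one standard example of a low displacement rank matrix, as an additive weight update. It is not taken from the authors' repository: the function names, the toy dimension `d`, and the random stand-in for the frozen weight `W` are all assumptions made for this example. It shows that a circulant update needs only `d` trainable parameters, can be applied with FFTs, and is generally not low-rank, in contrast to a LoRA update of rank r, which uses 2·d·r parameters and has rank at most r.

```python
import numpy as np

def circulant_matvec(c, x):
    """Multiply the circulant matrix with first column c by x using FFTs."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

def circulant_dense(c):
    """Build the dense circulant matrix; column j is c cyclically shifted by j."""
    n = len(c)
    return np.stack([np.roll(c, j) for j in range(n)], axis=1)

rng = np.random.default_rng(0)
d = 8                                  # toy hidden size (assumption)
W = rng.standard_normal((d, d))        # stand-in for a frozen pretrained weight
c = rng.standard_normal(d)             # the only trainable parameters: d values
x = rng.standard_normal(d)             # an input vector

# Adapted forward pass (W + C) x, without ever materializing C.
y = W @ x + circulant_matvec(c, x)

# Check the FFT path against the dense matrix and inspect the rank.
C = circulant_dense(c)
assert np.allclose(C @ x, circulant_matvec(c, x))
rank = np.linalg.matrix_rank(C)

surm_like_params = d                   # circulant update: d parameters
lora_rank1_params = 2 * d * 1          # LoRA with r = 1: 2*d*r parameters
print(f"circulant update: {surm_like_params} params, rank {rank}")
print(f"rank-1 LoRA update: {lora_rank1_params} params, rank at most 1")
```

In this toy setting the structured update uses fewer parameters than even a rank-1 LoRA update while its rank is not constrained by the parameter budget, which is the compactness versus expressiveness trade-off the abstract refers to. The paper's actual parameterizations and training setup may differ from this sketch.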

Code & Models

Repositories

arijitthegame/structured-matrices-peft (PyTorch, official)

Videos

Structured Unrestricted-Rank Matrices for Parameter Efficient Finetuning · SlidesLive

Taxonomy

Topics: Matrix Theory and Algorithms · Neural Networks and Applications · Electromagnetic Scattering and Analysis

Methods: Attention Is All You Need · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Position-Wise Feed-Forward Layer · Dropout · Adam · Linear Layer · Absolute Position Encodings