Efficient Adaptation of Pre-trained Vision Transformer via Householder Transformation
Wei Dong, Yuan Sun, Yiting Yang, Xing Zhang, Zhijun Lin, Qingsen Yan, Haokui Zhang, Peng Wang, Yang Yang, Hengtao Shen

TL;DR
This paper introduces a parameter-efficient fine-tuning (PEFT) method for Vision Transformers that uses Householder transformations to let the rank of the adaptation matrix vary from layer to layer, instead of fixing one bottleneck dimensionality for the whole model.
Contribution
The paper proposes a novel PEFT approach that represents the adaptation matrix in an SVD-inspired form built from Householder transformations, so that the rank of the adaptation matrix can differ across layers.
Findings
Achieves promising fine-tuning performance on standard vision tasks.
Enables layer-wise variation in adaptation matrix ranks.
Outperforms fixed-rank methods like LoRA and Adapter.
Abstract
A common strategy for Parameter-Efficient Fine-Tuning (PEFT) of pre-trained Vision Transformers (ViTs) involves adapting the model to downstream tasks by learning a low-rank adaptation matrix. This matrix is decomposed into a product of down-projection and up-projection matrices, with the bottleneck dimensionality being crucial for reducing the number of learnable parameters, as exemplified by prevalent methods like LoRA and Adapter. However, these low-rank strategies typically employ a fixed bottleneck dimensionality, which limits their flexibility in handling layer-wise variations. To address this limitation, we propose a novel PEFT approach inspired by Singular Value Decomposition (SVD) for representing the adaptation matrix. SVD decomposes a matrix into the product of a left unitary matrix, a diagonal matrix of scaling values, and a right unitary matrix. We utilize Householder…
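The SVD-style construction described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract is truncated here, so the square weight shape, the use of a single Householder vector per side, and the names `householder`, `adaptation_matrix`, `v_left`, `v_right`, and `scales` are all assumptions made for illustration. The sketch relies only on a standard fact: for any nonzero vector v, H = I - 2vv^T / (v^T v) is orthogonal, so it can stand in for a unitary factor while being defined by one vector.

```python
import numpy as np

def householder(v):
    """Return the Householder matrix H = I - 2 v v^T / (v^T v) for a nonzero vector v."""
    v = np.asarray(v, dtype=float)
    return np.eye(v.size) - 2.0 * np.outer(v, v) / np.dot(v, v)

def adaptation_matrix(v_left, scales, v_right):
    """Assumed SVD-like form: orthogonal (Householder) x diagonal x orthogonal (Householder)."""
    return householder(v_left) @ np.diag(scales) @ householder(v_right)

rng = np.random.default_rng(0)
d = 8                                  # hypothetical hidden size
v_left = rng.normal(size=d)            # one vector defines the left orthogonal factor
v_right = rng.normal(size=d)           # one vector defines the right orthogonal factor
scales = np.array([0.5, -0.3, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0])  # diagonal scaling values

delta_w = adaptation_matrix(v_left, scales, v_right)

# Householder matrices are orthogonal: H @ H.T equals the identity.
h_left = householder(v_left)
is_orthogonal = np.allclose(h_left @ h_left.T, np.eye(d))

# Because both outer factors are invertible, the rank of delta_w equals the
# number of nonzero scaling values, so it can differ from layer to layer.
rank = np.linalg.matrix_rank(delta_w)

# Parameter count in this sketch: two vectors plus the diagonal (3 * d),
# versus d * d for a dense update of the same shape.
n_params = v_left.size + v_right.size + scales.size
```

In this sketch the rank of the update is set by how many scaling values are nonzero rather than by a bottleneck width chosen in advance, which is the kind of layer-wise flexibility the abstract contrasts with fixed-rank methods such as LoRA and Adapter.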
Taxonomy
Topics: Advanced Vision and Imaging · CCD and CMOS Imaging Sensors · Image Processing Techniques and Applications
Methods: Adapter
