FacT: Factor-Tuning for Lightweight Adaptation on Vision Transformer

Shibo Jie; Zhi-Hong Deng

arXiv:2212.03145 · cs.CV · June 13, 2023 · 6 cites

TL;DR

FacT introduces a tensorization-decomposition framework for lightweight parameter-efficient transfer learning on vision transformers, achieving high performance with minimal trainable parameters and surpassing existing methods in various settings.

Contribution

The paper proposes a novel tensorization-decomposition approach for parameter-efficient transfer learning, significantly reducing trainable parameters while maintaining or improving performance.

Findings

01

On VTAB-1K, FacT performs on par with NOAH, the state-of-the-art PETL method, while being 5x more parameter-efficient.

02

A tiny version with only 8K parameters outperforms full fine-tuning.

03

FacT excels in few-shot learning, outperforming all PETL baselines.

Abstract

Recent work has explored the potential to adapt a pre-trained vision transformer (ViT) by updating only a few parameters so as to improve storage efficiency, called parameter-efficient transfer learning (PETL). Current PETL methods have shown that by tuning only 0.5% of the parameters, ViT can be adapted to downstream tasks with even better performance than full fine-tuning. In this paper, we aim to further promote the efficiency of PETL to meet the extreme storage constraint in real-world applications. To this end, we propose a tensorization-decomposition framework to store the weight increments, in which the weights of each ViT are tensorized into a single 3D tensor, and their increments are then decomposed into lightweight factors. In the fine-tuning process, only the factors need to be updated and stored, termed Factor-Tuning (FacT). On VTAB-1K benchmark, our method performs on par…
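To make the storage argument in the abstract concrete, here is a minimal parameter-count sketch. It is not taken from the paper's code; every shape and name in it is an assumption chosen for illustration: a ViT-B/16-sized backbone (hidden size 768, 12 layers), 12 square 768 x 768 weight slices per layer (four attention projections plus the two feed-forward matrices split into four slices each), and one possible factorization in which each slice's increment is U @ S_i @ V.T, with U and V shared across all slices and only a small r x r core S_i stored per slice.

```python
# Hypothetical parameter-count sketch for a factorized weight increment.
# Assumed (not stated in the abstract): ViT-B/16 sizes, 12 square d x d
# slices per layer, and increments of the form U @ S_i @ V.T with shared U, V.


def full_increment_params(d: int, num_slices: int) -> int:
    """Parameters needed to store a dense d x d increment for every slice."""
    return num_slices * d * d


def factored_increment_params(d: int, num_slices: int, r: int) -> int:
    """Parameters for shared U, V (each d x r) plus one r x r core per slice."""
    return 2 * d * r + num_slices * r * r


HIDDEN_DIM = 768        # assumed ViT-B/16 hidden size
NUM_LAYERS = 12         # assumed ViT-B/16 depth
SLICES_PER_LAYER = 12   # assumed: 4 attention + 8 feed-forward slices
NUM_SLICES = NUM_LAYERS * SLICES_PER_LAYER

full = full_increment_params(HIDDEN_DIM, NUM_SLICES)
tiny = factored_increment_params(HIDDEN_DIM, NUM_SLICES, r=4)

print(f"dense increments:    {full:,} parameters")
print(f"factored, rank r=4:  {tiny:,} parameters")
print(f"fraction stored:     {tiny / full:.4%}")
```

Under these assumptions, a rank of 4 yields a count in the same range as the 8K-parameter tiny version mentioned in the findings. Raising r trades storage for capacity, and the shared-factor term 2 * d * r dominates while r stays small.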

Code & Models

Repositories

jieshibo/petl-vit
PyTorch · Official

Videos

FacT: Factor-Tuning for Lightweight Adaptation on Vision Transformer

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · CCD and CMOS Imaging Sensors

Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Linear Layer · Dense Connections · Residual Connection · Layer Normalization · Vision Transformer