Selective Feature Adapter for Dense Vision Transformers
Xueqing Deng, Qi Fan, Xiaojie Jin, Linjie Yang, Peng Wang

TL;DR
This paper introduces Selective Feature Adapter (SFA), a novel method for fine-tuning dense vision transformers efficiently by selectively adapting features, achieving state-of-the-art performance with fewer trainable parameters across various vision tasks.
Contribution
The paper proposes a dual adapter approach with external and internal adapters, automatically discovering task-important parameters to optimize parameter efficiency and performance.
Findings
SFA achieves state-of-the-art results under parameter budget constraints.
SFA outperforms other adapter methods on dense vision tasks.
Dual adapters provide the best trade-off between performance and parameter efficiency.
Abstract
Fine-tuning pre-trained transformer models, e.g., Swin Transformer, has been successful in numerous downstream dense prediction vision tasks. However, one major issue is the cost and storage of their huge number of parameters, which becomes increasingly challenging to handle as the number of vision tasks grows. In this paper, we propose an effective approach to alleviate the issue, namely selective feature adapter (SFA). It achieves state-of-the-art (SoTA) performance under any given budget of trainable parameters, and demonstrates comparable or better performance than fully fine-tuned models across various dense tasks. Specifically, SFA consists of external adapters and internal adapters, which are applied sequentially over a transformer model. For external adapters, we properly select the places and amount of additional multilayer perceptron (MLP). For internal adapters, we transform a…
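To make the idea of a trainable-parameter budget concrete, the following is a minimal sketch of the arithmetic only, not the paper's selection procedure. It assumes each external adapter is a generic bottleneck MLP (a down-projection and an up-projection, both with biases). The function names and the values for `d_model`, `num_adapters`, and `budget` are hypothetical and do not come from the paper.

```python
def adapter_param_count(d_model: int, bottleneck: int) -> int:
    """Parameters in one assumed bottleneck MLP adapter (weights plus biases)."""
    down = d_model * bottleneck + bottleneck  # down-projection: d_model -> bottleneck
    up = bottleneck * d_model + d_model       # up-projection: bottleneck -> d_model
    return down + up


def largest_bottleneck_within_budget(d_model: int, num_adapters: int, budget: int) -> int:
    """Largest bottleneck width whose total adapter parameters fit the budget (0 if none fit)."""
    r = 0
    # Grow the bottleneck one unit at a time while the next size still fits.
    while num_adapters * adapter_param_count(d_model, r + 1) <= budget:
        r += 1
    return r


if __name__ == "__main__":
    # Hypothetical values chosen for illustration only.
    d_model = 768
    num_adapters = 12
    budget = 1_000_000

    r = largest_bottleneck_within_budget(d_model, num_adapters, budget)
    total = num_adapters * adapter_param_count(d_model, r)
    print(f"bottleneck={r}, trainable adapter parameters={total}")
```

Under these assumptions the adapter cost grows linearly with the bottleneck width, so a fixed budget translates directly into a maximum width per adapter or, equivalently, into a trade-off between how many adapters are placed and how wide each one is.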
Taxonomy
Topics: Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition
Methods: Multi-Head Attention · Dense Connections · Linear Layer · Label Smoothing · Absolute Position Encodings · Attention Is All You Need · Adam · Stochastic Depth · Residual Connection · Adapter
