Consolidator: Mergeable Adapter with Grouped Connections for Visual Adaptation

Tianxiang Hao; Hui Chen; Yuchen Guo; Guiguang Ding

arXiv:2305.00603 · cs.CV · May 2, 2023 · 1 citation

TL;DR

The paper introduces Consolidator, a parameter-efficient method for adapting vision transformers to downstream tasks by adding a small, tunable module with grouped connections, achieving high accuracy with minimal parameters.

Contribution

It proposes a novel consolidator module using grouped connections for efficient knowledge transfer in vision transformers, outperforming existing parameter-efficient tuning methods.
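
The summary above does not spell out the module's internals, so the following is only a minimal sketch under two assumptions: that the adapter is a grouped (block-diagonal) linear branch running in parallel with a frozen dense layer, and that "mergeable" means the branch can be folded into that layer's weight after training. The names `matvec`, `grouped_matvec`, and `merge`, and the sizes `d` and `groups`, are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
import random

def matvec(W, x):
    """Dense matrix-vector product; W is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def grouped_matvec(blocks, x):
    """Grouped connection: block k only sees the k-th slice of x."""
    out, start = [], 0
    for B in blocks:
        width = len(B[0])
        out.extend(matvec(B, x[start:start + width]))
        start += width
    return out

def merge(W, blocks):
    """Fold the grouped branch into W by adding each block on the diagonal."""
    merged = [row[:] for row in W]
    r0 = c0 = 0
    for B in blocks:
        for i, row in enumerate(B):
            for j, val in enumerate(row):
                merged[r0 + i][c0 + j] += val
        r0 += len(B)
        c0 += len(B[0])
    return merged

random.seed(0)
d, groups = 8, 4                 # illustrative sizes, not taken from the paper
size = d // groups

# Frozen pre-trained weight (assumed square for simplicity).
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]
# Tunable grouped branch: one small square block per group.
blocks = [[[random.gauss(0, 0.1) for _ in range(size)] for _ in range(size)]
          for _ in range(groups)]
x = [random.gauss(0, 1) for _ in range(d)]

# Training-time view: dense path plus grouped path.
two_branch = [a + b for a, b in zip(matvec(W, x), grouped_matvec(blocks, x))]
# Inference-time view: a single merged dense layer.
single = matvec(merge(W, blocks), x)
max_gap = max(abs(a - b) for a, b in zip(two_branch, single))

dense_params = d * d
grouped_params = groups * size * size   # equals d * d / groups
```

Under these assumptions the two-branch and merged outputs agree up to floating-point rounding, so the extra branch adds no inference cost once folded in. With `d = 8` and `groups = 4`, the grouped branch holds 16 tunable weights against 64 in the dense layer, which illustrates why grouping reduces the number of parameters that must be stored per task.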

Findings

01. Achieves up to 7.56% better accuracy than full fine-tuning.

02. Uses only 0.35% of parameters compared to full fine-tuning.

03. Outperforms state-of-the-art parameter-efficient tuning methods.

Abstract

Recently, transformers have shown strong ability as visual feature extractors, surpassing traditional convolution-based models in various scenarios. However, the success of vision transformers owes largely to their capacity to accommodate numerous parameters, which raises new challenges for adapting large models to downstream tasks. On the one hand, classic fine-tuning tunes all parameters of a huge model for every task and thus easily overfits, leading to inferior performance. On the other hand, fine-tuning stores a full copy of the parameters and is thus usually impracticable on resource-limited devices, where storage space is short. Yet few works have focused on how to efficiently and effectively transfer knowledge in a vision transformer. Existing methods do not examine the properties of visual features, leading to inferior performance. Moreover, some of…

Code & Models

Repositories

beyondhtx/consolidator · tf · Official

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques