Dyn-Adapter: Towards Disentangled Representation for Efficient Visual Recognition
Yurong Zhang, Honghao Chen, Xinyu Zhang, Xiangxiang Chu, Li Song

TL;DR
Dyn-Adapter introduces a dynamic, disentangled feature extraction method that reduces inference FLOPs by 50% while maintaining or improving accuracy in visual recognition tasks.
Contribution
It proposes a novel dynamic architecture with multi-level feature disentanglement and bidirectional sparsity to enhance PETL efficiency and effectiveness.
Findings
Reduces FLOPs by 50% during inference.
Maintains or improves recognition accuracy.
Demonstrates effectiveness across diverse datasets and backbones.
Abstract
Parameter-efficient transfer learning (PETL) is a promising approach that aims to adapt large-scale pre-trained models to downstream tasks at a relatively modest cost. However, current PETL methods struggle to reduce computational complexity and bear a heavy inference burden because they run the complete forward pass. This paper presents an efficient visual recognition paradigm, called Dynamic Adapter (Dyn-Adapter), that boosts PETL efficiency by subtly disentangling features at multiple levels. Our approach is simple: first, we devise a dynamic architecture with balanced early heads for multi-level feature extraction, along with an adaptive training strategy. Second, we introduce a bidirectional sparsity strategy driven by the pursuit of powerful generalization ability. These qualities enable us to fine-tune efficiently and effectively: we reduce FLOPs during inference by 50%, while…
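The abstract does not say how the early heads are used at inference time. One common way such heads save compute is confidence-based early exit: run the backbone stage by stage, query the head attached to each stage, and stop as soon as a head is confident enough. The sketch below illustrates that general idea in plain Python under stated assumptions; it is not the paper's confirmed procedure. The names `early_exit_predict`, `mean_compute_fraction`, and `threshold`, and the toy `stages` and `heads`, are all hypothetical, and the compute estimate assumes every stage costs the same and ignores the cost of the heads.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    x: Vector,
    stages: Sequence[Callable[[Vector], Vector]],
    heads: Sequence[Callable[[Vector], Vector]],
    threshold: float = 0.9,  # hypothetical confidence cutoff
) -> Tuple[int, int]:
    """Return (predicted_class, stages_executed).

    Assumption: one classification head per backbone stage; inference stops
    at the first head whose top softmax probability reaches `threshold`.
    The last head always answers if no earlier head is confident.
    """
    assert stages and len(stages) == len(heads)
    feat = x
    probs: Vector = []
    for depth, (stage, head) in enumerate(zip(stages, heads), start=1):
        feat = stage(feat)
        probs = softmax(head(feat))
        if max(probs) >= threshold:
            return probs.index(max(probs)), depth
    return probs.index(max(probs)), len(stages)


def mean_compute_fraction(exit_depths: Sequence[int], num_stages: int) -> float:
    """Average share of the backbone executed, assuming equal-cost stages."""
    return sum(exit_depths) / (len(exit_depths) * num_stages)


# Toy stand-ins: each "stage" doubles the features, each "head" is the identity.
stages = [lambda f: [v * 2.0 for v in f]] * 4
heads = [lambda f: list(f)] * 4

inputs = [[1.0, 0.0], [0.0, 3.0], [0.1, 0.0]]
results = [early_exit_predict(x, stages, heads) for x in inputs]
depths = [depth for _, depth in results]
fraction = mean_compute_fraction(depths, num_stages=len(stages))
```

In this toy setup, easy inputs leave after one or two stages while the ambiguous one runs all four, so the average compute falls below the full forward pass. The saving on real data depends on the model and the threshold; the 50% figure quoted above comes from the paper, not from this sketch.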
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Adversarial Robustness in Machine Learning
Methods: Adapter
