1%>100%: High-Efficiency Visual Adapter with Complex Linear Projection Optimization
Dongshuo Yin, Xue Yang, Deng-Ping Fan, Shi-Min Hu

TL;DR
This paper introduces CoLin, a high-efficiency adapter for vision foundation models that adds only about 1% parameters to the backbone, reducing fine-tuning cost while outperforming full fine-tuning and delta-tuning methods across several vision tasks.
Contribution
The paper proposes a low-rank complex adapter that adds only about 1% parameters to the backbone, together with a tailored loss that addresses the convergence issues of low-rank composite matrices during training.
Findings
CoLin outperforms full fine-tuning and delta-tuning methods.
The adapter adds only about 1% parameters to the backbone.
It is evaluated on multiple vision tasks, including object detection and segmentation.
Abstract
Deploying vision foundation models typically relies on efficient adaptation strategies, whereas conventional full fine-tuning suffers from prohibitive costs and low efficiency. While delta-tuning has proven effective in boosting the performance and efficiency of LLMs during adaptation, its advantages cannot be directly transferred to the fine-tuning pipeline of vision foundation models. To push the boundaries of adaptation efficiency for vision tasks, we propose an adapter with Complex Linear Projection Optimization (CoLin). For architecture, we design a novel low-rank complex adapter that introduces only about 1% parameters to the backbone. For efficiency, we theoretically prove that low-rank composite matrices suffer from severe convergence issues during training, and address this challenge with a tailored loss. Extensive experiments on object detection, segmentation, image…
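To make the phrase "low-rank complex adapter" more concrete, the following is a minimal sketch of what a complex-valued low-rank residual projection could look like. It is not the authors' implementation: the abstract does not specify the architecture, so the function names, the zero initialisation of the up-projection (borrowed from LoRA-style adapters), the choice to keep only the real part of the output, and the residual connection are all assumptions made for illustration. The tailored loss mentioned in the abstract is not reproduced here.

```python
import random


def init_complex_low_rank(d, rank, seed=0):
    """Create hypothetical complex factors A (d x rank) and B (rank x d)."""
    rng = random.Random(seed)
    # Down-projection A: small random complex entries.
    A = [[complex(rng.gauss(0, 0.02), rng.gauss(0, 0.02)) for _ in range(rank)]
         for _ in range(d)]
    # Up-projection B: zero-initialised (an assumption, LoRA-style),
    # so the adapter starts out as a no-op.
    B = [[0j for _ in range(d)] for _ in range(rank)]
    return A, B


def matmul(X, W):
    """Plain matrix product for nested lists (real or complex entries)."""
    return [[sum(x * w for x, w in zip(row, col)) for col in zip(*W)]
            for row in X]


def adapter_forward(X, A, B):
    """Add the real part of the low-rank complex projection to X (assumed design)."""
    H = matmul(X, A)  # (n, rank), complex
    Z = matmul(H, B)  # (n, d), complex
    return [[x + z.real for x, z in zip(x_row, z_row)]
            for x_row, z_row in zip(X, Z)]


def real_param_count(A, B):
    """Each complex entry stores two real numbers."""
    return 2 * (len(A) * len(A[0]) + len(B) * len(B[0]))


if __name__ == "__main__":
    d, rank = 16, 2
    A, B = init_complex_low_rank(d, rank)
    X = [[1.0] * d, [0.5] * d]
    Y = adapter_forward(X, A, B)
    print("adapter is a no-op at init:", Y == X)
    print("adapter real params:", real_param_count(A, B), "vs dense layer:", d * d)
```

The parameter count of such a factorisation grows linearly in the feature width d for a fixed rank, while a dense d x d layer grows quadratically, which is the general reason low-rank adapters stay small relative to the backbone. The specific 1% figure comes from the paper and is not derived by this sketch.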
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
