Minimal Interaction Separated Tuning: A New Paradigm for Visual Adaptation

Ningyuan Tang; Minghao Fu; Jianxin Wu

arXiv:2406.17559 · cs.CV · May 28, 2025 · 1 citation

Open Access

TL;DR

This paper introduces MIST, a separated tuning method for large vision models that enables efficient adaptation on low-resource devices by leveraging intermediate features and a lightweight attention-based adapter.

Contribution

MIST presents a new separated tuning paradigm that reduces information transfer and computational costs while maintaining high adaptation performance.

Findings

01. MIST achieves competitive results on visual adaptation benchmarks.

02. It significantly reduces information transfer overhead.

03. It demonstrates high efficiency in parameters, computation, and memory.

Abstract

The rapid scaling of large pretrained vision models makes fine-tuning increasingly difficult on devices with low computational resources. We explore a new visual adaptation paradigm called separated tuning, which treats large pretrained models as standalone feature extractors that run on powerful cloud servers. Fine-tuning is carried out on devices that possess only low computational resources (slow CPU, no GPU, small memory, etc.). We discuss existing methods that are potentially suitable for the separated tuning paradigm, but three major drawbacks hinder their application: low adaptation capability, a large adapter network, and, in particular, high information transfer overhead. To address these issues, we propose Minimal Interaction Separated Tuning, or MIST, which reveals that the sum of intermediate features from pretrained models not only has minimal…
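
The separated tuning setup described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract is truncated here and does not specify the adapter's architecture, so the function names, the random stand-in for backbone features, the tensor shapes, and the single-query attention pooling are all assumptions made for illustration. The only ideas taken from the text are that the server sends the sum of intermediate features and that the device runs a lightweight attention-based adapter.

```python
import numpy as np

rng = np.random.default_rng(0)

def fake_backbone_features(num_layers=12, num_tokens=197, dim=768):
    # Hypothetical stand-in for a frozen pretrained model running on a server.
    # Returns one (tokens, dim) feature map per layer, stacked on axis 0.
    return rng.standard_normal((num_layers, num_tokens, dim)).astype(np.float32)

def server_side_payload(layer_feats):
    # Sum the intermediate features over the layer axis, so a single
    # (tokens, dim) array is transferred instead of one array per layer.
    return layer_feats.sum(axis=0)

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class TinyAttentionAdaptor:
    """Assumed device-side adapter: one query vector attends over the tokens,
    then a linear layer maps the pooled feature to class logits."""

    def __init__(self, dim, num_classes, rng):
        self.query = (rng.standard_normal(dim) * 0.02).astype(np.float32)
        self.w_out = (rng.standard_normal((dim, num_classes)) * 0.02).astype(np.float32)
        self.b_out = np.zeros(num_classes, dtype=np.float32)

    def forward(self, summed):
        # Scaled dot-product scores between the query and every token.
        scores = summed @ self.query / np.sqrt(summed.shape[-1])  # (tokens,)
        weights = softmax(scores)                                 # (tokens,)
        pooled = weights @ summed                                 # (dim,)
        return pooled @ self.w_out + self.b_out                   # (num_classes,)

feats = fake_backbone_features()        # computed on the server
payload = server_side_payload(feats)    # the only data sent to the device
adaptor = TinyAttentionAdaptor(dim=768, num_classes=10, rng=rng)
logits = adaptor.forward(payload)       # computed on the device

# Payload shape, logits shape, and the reduction factor versus sending all layers.
print(payload.shape, logits.shape, feats.size // payload.size)
```

In this toy setting, summing over 12 layers makes the transferred array 12 times smaller than sending every layer's features, and the device only evaluates the small adapter. Training the adapter (gradients, optimizer) is omitted.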

Taxonomy

Topics: Color Science and Applications · Image and Video Quality Assessment · Advanced Vision and Imaging

Methods: Adapter