Visual Prompt Tuning for Test-time Domain Adaptation

Yunhe Gao; Xingjian Shi; Yi Zhu; Hao Wang; Zhiqiang Tang; Xiong Zhou; Mu Li; Dimitris N. Metaxas

arXiv:2210.04831 · cs.CV · December 2, 2022 · 22 cites

Open Access

TL;DR

This paper introduces Data-efficient Prompt Tuning (DePT), a simple and effective method for test-time domain adaptation that uses visual prompts in vision Transformers, achieving state-of-the-art results with high data efficiency and versatility.

Contribution

DePT is a novel prompt-based approach that adapts vision Transformers to new domains efficiently without source data, using memory bank pseudo-labeling and hierarchical self-supervised regularization.
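The prompt-based idea can be illustrated with a minimal sketch. It assumes the prompts are extra token vectors placed between the class token and the patch tokens, and that only those vectors are updated while the backbone stays frozen. The helper names (`prepend_prompts`, `trainable_fraction`) and all sizes are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import random


def prepend_prompts(patch_tokens, prompts):
    """Build the sequence [CLS, prompts..., patches...] for one image.

    patch_tokens: list of token vectors with the CLS token at index 0.
    prompts: list of learnable prompt vectors of the same width.
    """
    cls_token, patches = patch_tokens[0], patch_tokens[1:]
    return [cls_token] + list(prompts) + patches


def trainable_fraction(num_prompts, embed_dim, backbone_params):
    """Share of all parameters updated when only the prompts are tuned."""
    prompt_params = num_prompts * embed_dim
    return prompt_params / (backbone_params + prompt_params)


# Hypothetical toy sizes, for illustration only.
embed_dim, num_patches, num_prompts = 8, 4, 2
random.seed(0)
tokens = [[random.gauss(0, 1) for _ in range(embed_dim)]
          for _ in range(1 + num_patches)]
prompts = [[0.0] * embed_dim for _ in range(num_prompts)]

sequence = prepend_prompts(tokens, prompts)
print(len(sequence))  # 1 CLS + 2 prompts + 4 patches
```

Because the backbone weights never change in this setup, the number of tuned parameters grows only with the number of prompts and the embedding width, which is what makes the approach parameter-efficient.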

Findings

01

Achieves state-of-the-art performance on VisDA-C, ImageNet-C, and DomainNet-126.

02

Demonstrates high data efficiency with only 1% or 10% of the data.

03

Extends effectively to online and multi-source TTA settings.

Abstract

Models should be able to adapt to unseen data during test-time to avoid performance drops caused by inevitable distribution shifts in real-world deployment scenarios. In this work, we tackle the practical yet challenging test-time adaptation (TTA) problem, where a model adapts to the target domain without accessing the source data. We propose a simple recipe called Data-efficient Prompt Tuning (DePT) with two key ingredients. First, DePT plugs visual prompts into the vision Transformer and only tunes these source-initialized prompts during adaptation. We find such parameter-efficient finetuning can efficiently adapt the model representation to the target domain without overfitting to the noise in the learning objective. Second, DePT bootstraps the source representation to the target domain by memory bank-based online pseudo-labeling. A hierarchical self-supervised…
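As a rough illustration of memory bank-based pseudo-labeling, the sketch below stores features and soft predictions of recent target samples, then labels a new sample by averaging the predictions of its nearest neighbors in the bank. This is one common form of the technique under stated assumptions (cosine similarity, a first-in-first-out bank, a fixed `k`); the abstract does not specify these details, and the `MemoryBank` class is hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


class MemoryBank:
    """Hypothetical FIFO store of (feature, soft prediction) pairs."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.features = []
        self.probs = []

    def add(self, feature, prob):
        # Append the newest sample and evict the oldest one when full.
        self.features.append(feature)
        self.probs.append(prob)
        if len(self.features) > self.capacity:
            self.features.pop(0)
            self.probs.pop(0)

    def pseudo_label(self, feature, k=3):
        # Assumption: the label is the argmax of the mean soft prediction
        # over the k most similar stored features.
        if not self.features:
            raise ValueError("memory bank is empty")
        ranked = sorted(range(len(self.features)),
                        key=lambda i: cosine(feature, self.features[i]),
                        reverse=True)[:k]
        num_classes = len(self.probs[0])
        avg = [sum(self.probs[i][c] for i in ranked) / len(ranked)
               for c in range(num_classes)]
        label = max(range(num_classes), key=lambda c: avg[c])
        return label, avg


# Toy two-class example with made-up features and predictions.
bank = MemoryBank(capacity=4)
bank.add([1.0, 0.0], [0.9, 0.1])
bank.add([0.9, 0.1], [0.8, 0.2])
bank.add([0.0, 1.0], [0.2, 0.8])
bank.add([0.1, 0.9], [0.1, 0.9])

label, avg = bank.pseudo_label([1.0, 0.2], k=2)
print(label, avg)
```

Averaging over neighbors rather than trusting a single prediction is a way to smooth out noisy per-sample outputs; the bank is refreshed as adaptation proceeds so the labels track the current representation.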

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Cancer-related molecular mechanisms research

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Dropout · Softmax · Label Smoothing · Adam · Byte Pair Encoding · Absolute Position Encodings