OT-VP: Optimal Transport-guided Visual Prompting for Test-Time Adaptation

Yunbei Zhang; Akshay Mehra; Jihun Hamm

arXiv:2407.09498 · cs.CV · September 11, 2024

TL;DR

OT-VP introduces a test-time prompt learning method guided by optimal transport to adapt Vision Transformers to new domains without retraining or modifying the original model, achieving state-of-the-art results efficiently.

Contribution

This work presents OT-VP, a novel test-time visual prompting approach using optimal transport to align source and target domains without retraining or model modification.

Findings

01. Outperforms state-of-the-art on multiple datasets

02. Uses only four learned prompt tokens

03. Operates efficiently in memory and computation

Abstract

Vision Transformers (ViTs) have demonstrated remarkable capabilities in learning representations, but their performance is compromised when applied to unseen domains. Previous methods either engage in prompt learning during the training phase or modify model parameters at test time through entropy minimization. The former often overlooks unlabeled target data, while the latter doesn't fully address domain shifts. In this work, our approach, Optimal Transport-guided Test-Time Visual Prompting (OT-VP), handles these problems by leveraging prompt learning at test time to align the target and source domains without accessing the training process or altering pre-trained model parameters. This method involves learning a universal visual prompt for the target domain by optimizing the Optimal Transport distance. OT-VP, with only four learned prompt tokens, exceeds state-of-the-art performance…
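
To make the quantity being optimized more concrete, the sketch below computes an approximate Optimal Transport cost between two small sets of feature vectors. It is an illustration only, not the authors' implementation: the abstract does not specify the solver or the cost function, so the entropic-regularized Sinkhorn iteration, the squared Euclidean cost, the uniform weights, and the names `sinkhorn_cost`, `eps`, and `n_iters` are all assumptions made here. In the setting the abstract describes, the target features would depend on the prompt tokens, and a distance of this kind would be minimized with respect to them while the model weights stay frozen.

```python
import math


def sq_dist(x, y):
    """Squared Euclidean distance between two feature vectors (assumed cost)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def sinkhorn_cost(source, target, eps=1.0, n_iters=100):
    """Approximate OT cost between two point clouds with uniform weights.

    Uses entropic regularization (Sinkhorn iterations), which is an
    assumption of this sketch and not something stated in the abstract.
    """
    n, m = len(source), len(target)
    a = [1.0 / n] * n  # uniform mass on source samples
    b = [1.0 / m] * m  # uniform mass on target samples

    # Pairwise cost matrix and the corresponding Gibbs kernel.
    cost = [[sq_dist(s, t) for t in target] for s in source]
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]

    # Alternately rescale rows and columns to match the marginals.
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]

    # Transport plan and its total cost.
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    return sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))


# Toy "source" features and a shifted copy standing in for a target domain.
source = [[0.0, 0.0], [1.0, 0.0]]
shifted = [[2.0, 0.0], [3.0, 0.0]]

cost_same = sinkhorn_cost(source, source)
cost_shifted = sinkhorn_cost(source, shifted)
print(cost_same, cost_shifted)
```

On this toy data the shifted cloud yields the larger cost, which is the behavior an alignment objective relies on: moving the target features closer to the source features lowers the distance.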

Code & Models

Repositories

zybeich/ot-vp (PyTorch, official)

Taxonomy

Topics: Advanced Vision and Imaging · Gaze Tracking and Assistive Technology · Elevator Systems and Control

Methods: ALIGN