Exploring Visual Prompts for Adapting Large-Scale Models

Hyojin Bahng; Ali Jahanian; Swami Sankaranarayanan; Phillip Isola

arXiv:2203.17274 · cs.CV · June 6, 2022 · 106 cites

TL;DR

This paper explores visual prompting as a way to adapt large-scale vision models such as CLIP: a single learned image perturbation lets a frozen model perform a new task, with performance competitive with linear probes and robustness to distribution shift.

Contribution

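The paper investigates visual prompting as a way to adapt frozen pre-trained vision models, demonstrates its effectiveness and robustness to distribution shift, and analyzes how the downstream dataset, prompt design, and output transformation affect adaptation performance.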

Findings

01 Visual prompting is particularly effective for CLIP.

02 It is robust to distribution shift.

03 Performance is competitive with standard linear probes.

Abstract

We investigate the efficacy of visual prompting to adapt large-scale models in vision. Following the recent approach from prompt tuning and adversarial reprogramming, we learn a single image perturbation such that a frozen model prompted with this perturbation performs a new task. Through comprehensive experiments, we demonstrate that visual prompting is particularly effective for CLIP and robust to distribution shift, achieving performance competitive with standard linear probes. We further analyze properties of the downstream dataset, prompt design, and output transformation in regard to adaptation performance. The surprising effectiveness of visual prompting provides a new perspective on adapting pre-trained models in vision. Code is available at http://hjbahng.github.io/visual_prompting.
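
The core idea in the abstract, learning one shared input perturbation while the model's weights stay frozen, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frozen model here is a hypothetical random linear classifier standing in for a large pre-trained network, the data are synthetic, and the names `frozen_model`, `learn_prompt`, and `delta` are illustrative assumptions. With a linear stand-in, an additive prompt only shifts the logits by a constant, so the sketch shows the optimization loop rather than the behavior reported for CLIP.

```python
import numpy as np

def log_softmax(z):
    # Numerically stable log-softmax over the class axis.
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def frozen_model(x, W, b):
    # Hypothetical frozen classifier: W and b are never updated.
    return x @ W.T + b

def mean_loss(x, labels, W, b):
    # Mean cross-entropy of the frozen model on (x, labels).
    logp = log_softmax(frozen_model(x, W, b))
    return float(-logp[np.arange(len(labels)), labels].mean())

def learn_prompt(x, labels, W, b, steps=200, lr=0.01):
    # Learn ONE perturbation `delta`, shared by every input, with
    # full-batch gradient descent; only `delta` is trained.
    delta = np.zeros(x.shape[1])
    onehot = np.eye(W.shape[0])[labels]
    for _ in range(steps):
        probs = np.exp(log_softmax(frozen_model(x + delta, W, b)))
        # Gradient of the mean cross-entropy w.r.t. the input, averaged
        # over the batch: mean_n W^T (p_n - y_n).
        grad = ((probs - onehot) @ W).mean(axis=0)
        delta -= lr * grad
    return delta

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 16))   # frozen weights (3 classes, 16 features)
b = np.zeros(3)
W_before = W.copy()

# Synthetic "new task": labels come from the frozen model applied to
# inputs offset by a hidden shift, so a good prompt must compensate.
x = rng.normal(size=(256, 16))
hidden_shift = rng.normal(size=16)
labels = frozen_model(x + hidden_shift, W, b).argmax(axis=1)

loss_before = mean_loss(x, labels, W, b)
delta = learn_prompt(x, labels, W, b)
loss_after = mean_loss(x + delta, labels, W, b)
print(f"loss without prompt: {loss_before:.3f}, with prompt: {loss_after:.3f}")
```

The design choice to keep `W` and `b` fixed and update only `delta` mirrors the "frozen model" framing in the abstract; a real setup would replace the linear stand-in with a pre-trained network and use automatic differentiation to obtain the gradient with respect to the perturbation.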

Code & Models

Repositories

hjbahng/visual_prompting
PyTorch · Official

Models

yahya007/mplug2-vp-for-nriqa (Hugging Face model)

Taxonomy

Topics: CCD and CMOS Imaging Sensors · Cell Image Analysis Techniques · Advanced Vision and Imaging

Methods: Contrastive Language-Image Pre-training