An Image Is Worth 1000 Lies: Adversarial Transferability across Prompts on Vision-Language Models

Haochen Luo; Jindong Gu; Fengyuan Liu; Philip Torr

arXiv:2403.09766·cs.CV·March 18, 2024·2 cites

TL;DR

This paper introduces CroPA, a novel adversarial attack method that enhances the transferability of adversarial images across different prompts in vision-language models, revealing vulnerabilities in prompt-based task adaptation.

Contribution

We propose CroPA, a new attack technique that updates visual adversarial perturbations with learnable prompts to improve cross-prompt transferability in vision-language models.
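The idea of optimizing an image perturbation together with learnable prompts can be illustrated with a small sketch. The exact update rule is not spelled out in this summary, so the code below rests on assumptions: the image perturbation takes signed gradient steps that lower a targeted loss within an L-infinity budget, while a per-prompt perturbation takes steps that raise the same loss, so the image perturbation has to hold up against harder prompts. The toy linear "model", the function names, and every hyperparameter are hypothetical stand-ins, not the authors' implementation.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum()

def target_loss_and_grads(W_img, W_txt, image, prompt, target):
    """Cross-entropy toward `target` for a toy linear model (hypothetical stand-in
    for a VLM), with analytic gradients w.r.t. the image and the prompt."""
    probs = softmax(W_img @ image + W_txt @ prompt)
    loss = -np.log(probs[target])
    dlogits = probs.copy()
    dlogits[target] -= 1.0       # d(loss)/d(logits) = probs - onehot(target)
    return loss, W_img.T @ dlogits, W_txt.T @ dlogits

def cross_prompt_attack_sketch(W_img, W_txt, image, prompts, target,
                               eps=0.1, alpha_img=0.01, alpha_txt=0.05,
                               eps_txt=0.5, steps=200, prompt_every=2):
    """Min-max sketch: one shared image perturbation, one perturbation per prompt.
    The bound `eps_txt` on the prompt perturbation is an assumption of this sketch."""
    delta_img = np.zeros_like(image)
    delta_txt = [np.zeros_like(p) for p in prompts]
    for step in range(steps):
        i = step % len(prompts)  # cycle through the available prompts
        _, g_img, g_txt = target_loss_and_grads(
            W_img, W_txt, image + delta_img, prompts[i] + delta_txt[i], target)
        # Image perturbation: descend the targeted loss, then project to the eps-ball.
        delta_img = np.clip(delta_img - alpha_img * np.sign(g_img), -eps, eps)
        # Prompt perturbation: ascend the same loss, updated less frequently.
        if step % prompt_every == 0:
            delta_txt[i] = np.clip(delta_txt[i] + alpha_txt * np.sign(g_txt),
                                   -eps_txt, eps_txt)
    return delta_img

# Usage on random toy data (all shapes and values are illustrative).
rng = np.random.default_rng(0)
W_img, W_txt = rng.normal(size=(5, 16)), rng.normal(size=(5, 8))
image = rng.uniform(size=16)
prompts = [rng.normal(size=8) for _ in range(3)]
delta = cross_prompt_attack_sketch(W_img, W_txt, image, prompts, target=2)
print("max |delta|:", np.abs(delta).max())
```

The learned prompt perturbations are used only during optimization; at test time only the perturbed image would be paired with unseen prompts. A real attack would backpropagate through the VLM itself and would also keep the perturbed image within the valid pixel range, which this sketch omits.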

Findings

01. CroPA significantly improves adversarial transferability across prompts.

02. Vulnerabilities are demonstrated in models like Flamingo, BLIP-2, and InstructBLIP.

03. Cross-prompt attacks can mislead models regardless of prompt variations.

Abstract

Unlike traditional task-specific vision models, recent large vision-language models (VLMs) can readily adapt to different vision tasks simply by using different textual instructions, i.e., prompts. However, a well-known concern about traditional task-specific vision models is that they can be misled by imperceptible adversarial perturbations. The concern is exacerbated by the fact that the same adversarial perturbations can fool different task-specific models. Given that VLMs rely on prompts to adapt to different tasks, an intriguing question emerges: can a single adversarial image mislead all predictions of VLMs when a thousand different prompts are given? This question introduces a novel perspective on adversarial transferability: cross-prompt adversarial transferability. In this work, we propose the Cross-Prompt Attack (CroPA). This proposed method updates the visual…

Code & Models

Repositories

haochen-luo/cropa (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning