One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models
Lin Li, Haoyan Guan, Jianing Qiu, Michael Spratling

TL;DR
This paper introduces Adversarial Prompt Tuning (APT), which improves the adversarial robustness of vision-language models by learning a robust text prompt while the model weights stay frozen. The authors report gains over hand-engineered prompts across 15 datasets.
Contribution
The paper proposes APT, a simple prompt-learning approach that improves the adversarial robustness of VLMs while remaining computationally and data efficient, and that outperforms hand-engineered prompts and other methods.
Findings
Adding one learned word to prompts greatly improves robustness and accuracy.
APT outperforms hand-engineered prompts and other methods across datasets.
Robustness gains hold even in limited-data settings, from 1-shot up to full training data.
Abstract
Large pre-trained Vision-Language Models (VLMs) like CLIP, despite having remarkable generalization ability, are highly vulnerable to adversarial examples. This work studies the adversarial robustness of VLMs from the novel perspective of the text prompt instead of the extensively studied model weights (frozen in this work). We first show that the effectiveness of both adversarial attack and defense is sensitive to the text prompt used. Inspired by this, we propose a method to improve resilience to adversarial attacks by learning a robust text prompt for VLMs. The proposed method, named Adversarial Prompt Tuning (APT), is effective while being both computationally and data efficient. Extensive experiments are conducted across 15 datasets and 4 data sparsity schemes (from 1-shot to full training data settings) to show APT's superiority over hand-engineered prompts and other…
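The idea in the abstract (keep the model frozen, attack the image, update only the prompt) can be illustrated with a toy sketch. This is not the paper's implementation: the two-class "model", the elementwise-gating stand-in for a text encoder, the values in `CLASS_EMB` and `DATA`, and all hyperparameters are assumptions chosen so the example runs in pure Python. Only the overall loop structure (craft adversarial images against the current prompt, then take a gradient step on the prompt alone) reflects what the abstract describes.

```python
import math

# Frozen toy "class-name embeddings" (hypothetical values: 2 classes x 2 features).
CLASS_EMB = [[1.0, 5.0], [-1.0, -5.0]]

def text_features(prompt):
    # Assumed stand-in for a frozen text encoder: the learnable prompt
    # vector gates each class embedding elementwise.
    return [[p * w for p, w in zip(prompt, emb)] for emb in CLASS_EMB]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(x, prompt):
    # Image "encoder" is the identity here; logits are dot products.
    return [sum(t * xi for t, xi in zip(f, x)) for f in text_features(prompt)]

def loss(x, y, prompt):
    return -math.log(softmax(logits(x, prompt))[y])

def grads(x, y, prompt):
    # Analytic cross-entropy gradients w.r.t. the image and the prompt.
    probs = softmax(logits(x, prompt))
    err = [p - (1.0 if c == y else 0.0) for c, p in enumerate(probs)]
    mix = [sum(err[c] * CLASS_EMB[c][i] for c in range(len(CLASS_EMB)))
           for i in range(len(x))]
    grad_x = [m * p for m, p in zip(mix, prompt)]
    grad_p = [m * xi for m, xi in zip(mix, x)]
    return grad_x, grad_p

def sign(v):
    return (v > 0) - (v < 0)

def pgd(x, y, prompt, eps=0.3, alpha=0.1, steps=5):
    # L-infinity PGD on the image only; the prompt is held fixed.
    delta = [0.0] * len(x)
    for _ in range(steps):
        x_adv = [xi + d for xi, d in zip(x, delta)]
        grad_x, _ = grads(x_adv, y, prompt)
        delta = [max(-eps, min(eps, d + alpha * sign(g)))
                 for d, g in zip(delta, grad_x)]
    return [xi + d for xi, d in zip(x, delta)]

def tune_prompt(data, prompt, lr=0.05, epochs=100):
    # Only the prompt is updated; CLASS_EMB is never modified.
    prompt = list(prompt)
    for _ in range(epochs):
        for x, y in data:
            x_adv = pgd(x, y, prompt)            # attack the current prompt
            _, grad_p = grads(x_adv, y, prompt)  # gradient at the adversarial image
            prompt = [p - lr * g for p, g in zip(prompt, grad_p)]
    return prompt

def robust_loss(data, prompt):
    return sum(loss(pgd(x, y, prompt), y, prompt) for x, y in data) / len(data)

# Toy data: feature 0 has a wide margin, feature 1 a narrow one.
DATA = [([1.0, 0.1], 0), ([-1.0, -0.1], 1)]
init_prompt = [1.0, 1.0]
tuned_prompt = tune_prompt(DATA, init_prompt)
print("robust loss before:", robust_loss(DATA, init_prompt))
print("robust loss after: ", robust_loss(DATA, tuned_prompt))
```

In this toy setup the tuned prompt ends up down-weighting the narrow-margin feature that the attack exploits and up-weighting the wide-margin one, which is one intuition for how changing only the prompt can affect robustness. A real CLIP-based setup would replace the stand-in encoders with the frozen pre-trained image and text encoders.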
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · Topic Modeling
Methods: Contrastive Language-Image Pre-training
