Revisiting the Robust Generalization of Adversarial Prompt Tuning
Fan Yang, Mingxuan Xia, Sangzhou Xia, Chicheng Ma, Hui Hui

TL;DR
This paper introduces CAPT, an adaptive prompt tuning framework that improves adversarial robustness and generalization of CLIP models across multiple datasets by balancing consistency between clean and adversarial inputs.
Contribution
The paper proposes a novel adaptive consistency-guided adversarial prompt tuning method that enhances multi-modal feature alignment and robustness of pre-trained vision-language models.
Findings
CAPT outperforms existing methods on 14 datasets.
It maintains high accuracy on clean data while improving adversarial robustness.
CAPT shows strong generalization under distribution shifts.
Abstract
Understanding the vulnerability of large-scale pre-trained vision-language models like CLIP against adversarial attacks is key to ensuring zero-shot generalization capacity on various downstream tasks. State-of-the-art defense mechanisms generally adopt prompt learning strategies for adversarial fine-tuning to improve the adversarial robustness of the pre-trained model while keeping the efficiency of adapting to downstream tasks. Such a setup leads to the problem of over-fitting which impedes further improvement of the model's generalization capacity on both clean and adversarial examples. In this work, we propose an adaptive Consistency-guided Adversarial Prompt Tuning (i.e., CAPT) framework that utilizes multi-modal prompt learning to enhance the alignment of image and text features for adversarial examples and leverage the strong generalization of pre-trained CLIP to guide the…
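The abstract is cut off before it says exactly how pre-trained CLIP guides the tuned prompts, so the following is only a generic sketch of a consistency-guided objective, not the paper's loss. It assumes the objective is a cross-entropy term on adversarial logits plus a KL term that pulls the tuned model's adversarial prediction toward a frozen model's clean prediction. The function names, the fixed `weight`, and the choice of KL divergence are all illustrative assumptions; the adaptive weighting that CAPT's name suggests is not described in this summary and is not modeled.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[label])

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def consistency_guided_loss(adv_logits, frozen_clean_logits, label, weight=1.0):
    """Hypothetical objective: task loss on the adversarial input plus a
    consistency penalty against a frozen model's clean prediction.
    `weight` is a fixed constant here (an assumption, not CAPT's scheme)."""
    task = cross_entropy(adv_logits, label)
    reference = softmax(frozen_clean_logits)   # frozen model, clean image
    prediction = softmax(adv_logits)           # tuned model, adversarial image
    return task + weight * kl_divergence(reference, prediction)

if __name__ == "__main__":
    clean = [2.0, 0.5, -1.0]   # made-up logits for a 3-class toy problem
    adv = [0.0, 2.0, -1.0]     # made-up logits after a perturbation
    print(consistency_guided_loss(adv, clean, label=0))
```

When the adversarial prediction matches the frozen clean prediction, the consistency term vanishes and only the task loss remains; the further the two diverge, the larger the penalty.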
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Physical Unclonable Functions (PUFs) and Hardware Security
Methods: Contrastive Language-Image Pre-training
