AdvCLIP: Downstream-agnostic Adversarial Examples in Multimodal Contrastive Learning
Ziqi Zhou, Shengshan Hu, Minghui Li, Hangtao Zhang, Yechao Zhang, Hai Jin

TL;DR
AdvCLIP introduces a novel universal adversarial attack framework targeting cross-modal contrastive models like CLIP, demonstrating effective cross-task attacks and highlighting the need for new defenses.
Contribution
This work presents the first downstream-agnostic adversarial attack framework for multimodal contrastive models. It uses a topology-deviation based generative adversarial network to generate a universal adversarial patch.
Findings
AdvCLIP achieves high attack success across multiple datasets and tasks.
The attack significantly reduces the similarity scores in feature space.
Existing defenses are insufficient against AdvCLIP, indicating a need for new methods.
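To make the feature-space finding concrete, here is a minimal sketch of how one might measure the similarity drop caused by a single shared patch. Everything in it is an assumption for illustration: `toy_encoder` is a hypothetical random linear map standing in for a pre-trained image encoder, the patch is random rather than optimized, and the names `apply_patch` and `mean_cosine_similarity` are not from the paper. It shows only the measurement, not the AdvCLIP attack itself.

```python
import numpy as np

def apply_patch(images, patch, top=0, left=0):
    """Paste one shared (universal) patch onto every image in the batch."""
    patched = images.copy()
    h, w = patch.shape[:2]
    patched[:, top:top + h, left:left + w, :] = patch
    return patched

def toy_encoder(images, weights):
    """Hypothetical stand-in for a pre-trained image encoder: flatten + linear map."""
    flat = images.reshape(images.shape[0], -1)
    return flat @ weights

def mean_cosine_similarity(a, b):
    """Average cosine similarity between matching rows of a and b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return float(np.mean(np.sum(a * b, axis=1)))

rng = np.random.default_rng(0)
images = rng.random((8, 32, 32, 3))            # batch of toy images in [0, 1)
weights = rng.normal(size=(32 * 32 * 3, 64))   # assumed encoder parameters
patch = rng.random((8, 8, 3))                  # random patch, NOT an optimized one

clean_feats = toy_encoder(images, weights)
patched_feats = toy_encoder(apply_patch(images, patch, top=24, left=24), weights)

# Similarity of clean features with themselves versus with patched features.
print(mean_cosine_similarity(clean_feats, clean_feats))
print(mean_cosine_similarity(clean_feats, patched_feats))
```

With a real encoder and an optimized patch, the second value is the quantity an attack of this kind would try to push down. The random patch here is not expected to reduce it by much.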
Abstract
Multimodal contrastive learning aims to train a general-purpose feature extractor, such as CLIP, on vast amounts of raw, unlabeled paired image-text data. This can greatly benefit various complex downstream tasks, including cross-modal image-text retrieval and image classification. Despite its promising prospect, the security issue of cross-modal pre-trained encoder has not been fully explored yet, especially when the pre-trained encoder is publicly available for commercial use. In this work, we propose AdvCLIP, the first attack framework for generating downstream-agnostic adversarial examples based on cross-modal pre-trained encoders. AdvCLIP aims to construct a universal adversarial patch for a set of natural images that can fool all the downstream tasks inheriting the victim cross-modal pre-trained encoder. To address the challenges of heterogeneity between different modalities and…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning · Geophysical Methods and Applications
Methods: Contrastive Learning · Contrastive Language-Image Pre-training
