BadCLIP: Trigger-Aware Prompt Learning for Backdoor Attacks on CLIP

Jiawang Bai; Kuofeng Gao; Shaobo Min; Shu-Tao Xia; Zhifeng Li; Wei Liu

arXiv:2311.16194·cs.CV·March 25, 2024·1 cites

BadCLIP: Trigger-Aware Prompt Learning for Backdoor Attacks on CLIP

Jiawang Bai, Kuofeng Gao, Shaobo Min, Shu-Tao Xia, Zhifeng Li, Wei Liu

PDF

Open Access

TL;DR

BadCLIP introduces a novel backdoor attack on CLIP models by influencing both image and text encoders through trigger-aware prompts, achieving high success rates with minimal data and strong generalization across datasets.

Contribution

This work presents BadCLIP, the first prompt learning-based backdoor attack on CLIP that influences both encoders, effective with limited data, and demonstrates high success and generalization.

Findings

01

Attack success rate exceeds 99% in most cases.

02

Maintains similar clean accuracy to advanced prompt methods.

03

Effective across multiple datasets and unseen classes.

Abstract

Contrastive Vision-Language Pre-training, known as CLIP, has shown promising effectiveness in addressing downstream image recognition tasks. However, recent works revealed that the CLIP model can be implanted with a downstream-oriented backdoor. On downstream tasks, one victim model performs well on clean samples but predicts a specific target class whenever a specific trigger is present. For injecting a backdoor, existing attacks depend on a large amount of additional data to maliciously fine-tune the entire pre-trained CLIP model, which makes them inapplicable to data-limited scenarios. In this work, motivated by the recent success of learnable prompts, we address this problem by injecting a backdoor into the CLIP model in the prompt learning stage. Our method named BadCLIP is built on a novel and effective mechanism in backdoor attacks on CLIP, i.e., influencing both the image and…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAdversarial Robustness in Machine Learning · COVID-19 diagnosis using AI · Domain Adaptation and Few-Shot Learning

MethodsContrastive Language-Image Pre-training