Better Safe than Sorry: Pre-training CLIP against Targeted Data Poisoning and Backdoor Attacks

Wenhan Yang; Jingdong Gao; Baharan Mirzasoleiman

arXiv:2310.05862·cs.LG·June 12, 2024

TL;DR

This paper introduces SAFECLIP, a novel pre-training method that significantly enhances CLIP's robustness against targeted data poisoning and backdoor attacks by identifying and isolating risky data during training.

Contribution

SAFECLIP is a new defense approach that uses unimodal contrastive learning and Gaussian Mixture Models to prevent poisoning attacks during CLIP pre-training.
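
To make the role of the Gaussian Mixture Model more concrete, here is a minimal sketch of how a two-component mixture could separate likely-clean from likely-risky image-caption pairs. This summary does not say what the GMM is fitted to, so the sketch assumes it is fitted to one similarity score per pair (for example, the cosine similarity between image and caption embeddings). The function names `fit_gmm_1d` and `split_safe_risky` and the `threshold` parameter are hypothetical, and the hand-rolled EM loop stands in for whatever implementation the authors use.

```python
import math


def _gauss(x, mu, var):
    """Density of a 1-D normal distribution."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(xs, iters=100):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    lo, hi = min(xs), max(xs)
    mu = [lo, hi]  # start the two means at the extremes of the data
    var = [((hi - lo) / 4.0) ** 2 + 1e-6] * 2
    weight = [0.5, 0.5]
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each score
        resp = []
        for x in xs:
            p = [weight[k] * _gauss(x, mu[k], var[k]) for k in range(2)]
            total = p[0] + p[1]
            resp.append([0.5, 0.5] if total == 0.0 else [p[0] / total, p[1] / total])
        # M-step: re-estimate weights, means, and variances
        for k in range(2):
            nk = sum(r[k] for r in resp) + 1e-12
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk + 1e-6
            weight[k] = nk / len(xs)
    return mu, var, weight


def split_safe_risky(similarities, threshold=0.9):
    """Return (safe_indices, risky_indices) from per-pair similarity scores.

    Assumption (not from the source): pairs that belong to the
    higher-similarity component with posterior >= threshold count as safe.
    """
    mu, var, weight = fit_gmm_1d(similarities)
    high = 0 if mu[0] > mu[1] else 1  # component with the larger mean
    safe, risky = [], []
    for i, x in enumerate(similarities):
        p = [weight[k] * _gauss(x, mu[k], var[k]) for k in range(2)]
        total = p[0] + p[1]
        post_high = p[high] / total if total > 0.0 else 0.0
        (safe if post_high >= threshold else risky).append(i)
    return safe, risky
```

Under these assumptions, a pair is only trusted when the mixture places it firmly in the high-similarity component, so a stricter `threshold` trades a smaller safe set for a lower chance of admitting a poisoned pair.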

Findings

1. Reduces the success rate of targeted data poisoning attacks from 93.75% to 0%.

2. Reduces the backdoor attack success rate from as high as 100% to 0%.

3. Maintains CLIP's original performance on benchmark datasets.

Abstract

Contrastive Language-Image Pre-training (CLIP) on large image-caption datasets has achieved remarkable success in zero-shot classification and enabled transferability to new domains. However, CLIP is far more vulnerable to targeted data poisoning and backdoor attacks than supervised learning. Perhaps surprisingly, poisoning 0.0001% of CLIP pre-training data is enough to make targeted data poisoning attacks successful. This is four orders of magnitude smaller than what is required to poison supervised models. Despite this vulnerability, existing methods offer very limited defense for CLIP models during pre-training. In this work, we propose a strong defense, SAFECLIP, to safely pre-train CLIP against targeted data poisoning and backdoor attacks. SAFECLIP warms up the model by applying unimodal contrastive learning (CL) on image and text modalities separately. Then, it…
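
The warm-up step described in the abstract, unimodal contrastive learning within a single modality, can be illustrated with a small sketch. The abstract does not name the exact loss, so this example assumes a simplified, one-directional InfoNCE objective over two augmented views of the same items (two crops of each image, or two perturbed versions of each caption). The function name `info_nce` and the `temperature` default are hypothetical choices for illustration, not the authors' settings.

```python
import math


def _normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(view_a, view_b, temperature=0.1):
    """Mean InfoNCE loss; view_a[i] and view_b[i] embed the same item.

    Both views come from ONE modality, so no image-caption pairing is
    used here. Every other row of view_b acts as a negative for view_a[i].
    """
    a = [_normalize(v) for v in view_a]
    b = [_normalize(v) for v in view_b]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature
        logits = [sum(x * y for x, y in zip(ai, bj)) / temperature for bj in b]
        # Numerically stable log-sum-exp for the softmax denominator
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]  # negative log-probability of the positive
    return total / len(a)
```

Because the loss never compares an image with its caption, a poisoned caption has no direct way to pull its image toward the attacker's target during this phase, which is presumably why it is used as a warm-up.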

Code & Models

Repositories

bigml-cs-ucla/safeclip (PyTorch, official)

Taxonomy

Topics: COVID-19 diagnosis using AI · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning

Methods: Contrastive Language-Image Pre-training · Contrastive Learning