Adversarial Backdoor Defense in CLIP
Junhao Kuang, Siyuan Liang, Jiawei Liang, Kuanrong Liu, and Xiaochun Cao

TL;DR
This paper introduces Adversarial Backdoor Defense (ABD), a novel data augmentation method that enhances CLIP's robustness against backdoor attacks by aligning features with adversarial examples, significantly reducing attack success rates.
Contribution
The paper proposes ABD, a new adversarial data augmentation strategy that effectively defends multimodal models like CLIP against backdoor attacks, outperforming existing methods.
Findings
ABD reduces attack success rates significantly across multiple backdoor attack types.
ABD maintains high clean accuracy with minimal performance loss.
ABD outperforms the state-of-the-art defense method, CleanCLIP.
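The two metrics named in these findings can be illustrated with a minimal sketch. It assumes the common convention that attack success rate (ASR) is the fraction of triggered inputs, excluding those whose true label already equals the attacker's target, that the model classifies as the target, and that clean accuracy is ordinary accuracy on untriggered inputs. The function names and toy labels are hypothetical, and the paper's exact evaluation protocol may differ.

```python
def attack_success_rate(preds_on_triggered, true_labels, target_label):
    """Fraction of triggered inputs predicted as the attacker's target label.

    Assumption: samples whose true label already equals the target are
    excluded, since predicting the target for them is not an attack success.
    """
    eligible = [p for p, y in zip(preds_on_triggered, true_labels)
                if y != target_label]
    if not eligible:
        return 0.0
    hits = sum(1 for p in eligible if p == target_label)
    return hits / len(eligible)


def clean_accuracy(preds_on_clean, true_labels):
    """Fraction of untriggered inputs classified correctly."""
    if not true_labels:
        return 0.0
    correct = sum(1 for p, y in zip(preds_on_clean, true_labels) if p == y)
    return correct / len(true_labels)


# Toy labels (hypothetical): four test images, target class 0.
true_labels = [1, 2, 1, 0]
preds_triggered = [0, 0, 1, 2]   # predictions when the trigger is present
preds_clean = [1, 2, 0, 0]       # predictions on the same images, no trigger

asr = attack_success_rate(preds_triggered, true_labels, target_label=0)
acc = clean_accuracy(preds_clean, true_labels)
```

A good defense drives the first number toward zero while keeping the second close to that of the undefended model.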
Abstract
Multimodal contrastive pretraining, exemplified by models like CLIP, has been found to be vulnerable to backdoor attacks. While current backdoor defense methods primarily employ conventional data augmentation to create augmented samples aimed at feature alignment, these methods fail to capture the distinct features of backdoor samples, resulting in suboptimal defense performance. Observations reveal that adversarial examples and backdoor samples exhibit similarities in the feature space within the compromised models. Building on this insight, we propose Adversarial Backdoor Defense (ABD), a novel data augmentation strategy that aligns features with meticulously crafted adversarial examples. This approach effectively disrupts the backdoor association. Our experiments demonstrate that ABD provides robust defense against both traditional uni-modal and multimodal backdoor attacks targeting…
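The idea of aligning features with adversarial examples can be sketched on a toy model. This is not the authors' implementation: the linear "image encoder" W, the single-step sign-gradient (FGSM-style) perturbation, the cosine alignment loss, and all names below are assumptions chosen to keep the example self-contained. The abstract does not specify how ABD crafts its adversarial examples or which alignment objective it uses.

```python
import math


def matvec(W, x):
    """Apply a toy linear encoder: features = W @ x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def sign(v):
    return (v > 0) - (v < 0)


def fgsm_example(W, x, text_emb, eps):
    """One sign-gradient step that lowers image-text similarity.

    Assumed adversarial loss: L(x) = -<W x, t>. For a linear encoder its
    gradient with respect to x is -W^T t, so no autodiff is needed here.
    """
    grad = [-sum(W[i][j] * text_emb[i] for i in range(len(W)))
            for j in range(len(x))]
    return [xj + eps * sign(g) for xj, g in zip(x, grad)]


def alignment_loss(W, x, x_adv):
    """Assumed alignment term: 1 - cosine similarity of the two features."""
    return 1.0 - cosine(matvec(W, x), matvec(W, x_adv))


# Toy setup (hypothetical values): identity encoder, one image-text pair.
W = [[1.0, 0.0], [0.0, 1.0]]
x = [1.0, 0.5]
t = [1.0, 0.0]

x_adv = fgsm_example(W, x, t, eps=0.1)
sim_clean = dot(matvec(W, x), t)
sim_adv = dot(matvec(W, x_adv), t)
loss = alignment_loss(W, x, x_adv)
```

In a fine-tuning loop, a term like `alignment_loss` would be minimized alongside the usual contrastive objective, so the encoder learns to map each clean sample and its adversarial counterpart to nearby features.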
Taxonomy
Methods: Contrastive Language-Image Pre-training
