Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats

Kuanrong Liu; Siyuan Liang; Jiawei Liang; Pengwen Dai; Xiaochun Cao

arXiv:2409.19526·cs.CR·October 1, 2024

TL;DR

This paper introduces a novel token-level unlearning method for defending multimodal contrastive learning models against backdoor attacks, achieving high effectiveness with minimal impact on clean accuracy.

Contribution

The study proposes a fast, efficient backdoor defense technique using token-based unlearning and suspicious sample detection, improving over existing methods in speed and accuracy.

Findings

01 Reduces the attack success rate by 19% compared with state-of-the-art (SOTA) defenses.

02 Maintains high clean accuracy, with an increase of 2.57%.

03 Is effective against a range of backdoor attack methods on CLIP models.
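The two headline metrics in the findings, attack success rate and clean accuracy, can be illustrated with a minimal sketch. The function names and the list-of-labels inputs below are assumptions made for illustration; they are not taken from the paper, whose exact evaluation protocol is not given on this page.

```python
def attack_success_rate(triggered_predictions, target_label):
    """Fraction of trigger-stamped inputs that the model assigns to the
    attacker's target label. Lower is better for a defense.

    Assumption: every input in triggered_predictions carried the trigger.
    """
    if not triggered_predictions:
        raise ValueError("triggered_predictions must be non-empty")
    hits = sum(1 for p in triggered_predictions if p == target_label)
    return hits / len(triggered_predictions)


def clean_accuracy(predictions, labels):
    """Fraction of clean (trigger-free) inputs classified correctly.
    Higher is better."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


if __name__ == "__main__":
    # Hypothetical toy predictions, not results from the paper.
    print(attack_success_rate([0, 0, 1, 0], target_label=0))
    print(clean_accuracy([1, 2, 3, 3], [1, 2, 0, 3]))
```

A defense is judged on both numbers together: driving the attack success rate down is only useful if clean accuracy does not fall with it.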

Abstract

Multimodal contrastive learning uses various data modalities to create high-quality features, but its reliance on extensive data sources on the Internet makes it vulnerable to backdoor attacks. These attacks insert malicious behaviors during training, which are activated by specific triggers during inference, posing significant security risks. Although existing fine-tuning countermeasures reduce the malicious impact of such attacks, they frequently require extensive training time and degrade clean accuracy. In this study, we propose an efficient defense mechanism against backdoor threats based on machine unlearning. The method, called Unlearn Backdoor Threats (UBT), strategically creates a small set of poisoned samples to help the model rapidly unlearn backdoor vulnerabilities. We specifically use overfit training to improve backdoor…
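For readers unfamiliar with the training objective that the abstract refers to, the following is a minimal pure-Python sketch of a CLIP-style symmetric contrastive loss over a batch of paired image and text embeddings. It is a generic illustration, not the UBT defense itself; the function name `clip_style_loss` and the temperature value 0.07 are assumptions chosen for the example.

```python
import math


def _normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _log_softmax(row):
    """Numerically stable log-softmax of one row of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def clip_style_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy over image-to-text and text-to-image
    similarities. Row k of image_embs is assumed to be paired with
    row k of text_embs."""
    imgs = [_normalize(v) for v in image_embs]
    txts = [_normalize(v) for v in text_embs]
    n = len(imgs)
    # Cosine similarity matrix scaled by the temperature.
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]
    logits_t = [list(col) for col in zip(*logits)]
    # The matching pair sits on the diagonal in both directions.
    loss_i2t = -sum(_log_softmax(logits[k])[k] for k in range(n)) / n
    loss_t2i = -sum(_log_softmax(logits_t[k])[k] for k in range(n)) / n
    return 0.5 * (loss_i2t + loss_t2i)


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    aligned_texts = [[1.0, 0.0], [0.0, 1.0]]
    swapped_texts = [[0.0, 1.0], [1.0, 0.0]]
    print(clip_style_loss(images, aligned_texts))  # small: pairs match
    print(clip_style_loss(images, swapped_texts))  # large: pairs mismatched
```

Because the loss pulls each image toward whatever text it is paired with, a poisoned pair that couples a trigger pattern with the attacker's target caption is learned just like a legitimate pair, which is the vulnerability the abstract describes.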

Taxonomy

Topics: Network Security and Intrusion Detection

Methods: Sparse Evolutionary Training · Contrastive Language-Image Pre-training · Contrastive Learning