Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning

Siyuan Liang; Kuanrong Liu; Jiajun Gong; Jiawei Liang; Yuan Xun; Ee-Chien Chang; and Xiaochun Cao

arXiv:2403.16257 · cs.CV · March 26, 2024 · 2 cites

Open Access

TL;DR

This paper proposes a novel, low-cost defense method for multimodal contrastive learning models against backdoor attacks by identifying suspicious samples and applying localized token unlearning to remove malicious behaviors while maintaining high accuracy.

Contribution

It introduces a token-based localized unlearning approach that effectively mitigates backdoor threats with minimal impact on clean model performance.

Findings

01. Reduces backdoor attack success rate significantly.

02. Preserves high clean accuracy after unlearning.

03. Requires only a small set of poisoned samples for effective defense.
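The two quantities named in these findings, attack success rate and clean accuracy, can be made concrete with a minimal sketch. The function names and the toy labels below are hypothetical and are not taken from the paper. The sketch also assumes one common convention for attack success rate: triggered inputs whose true class already equals the attacker's target are excluded. The paper's exact evaluation protocol may differ.

```python
from typing import Sequence


def clean_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of clean (trigger-free) inputs that are classified correctly."""
    if len(preds) != len(labels) or len(labels) == 0:
        raise ValueError("preds and labels must be non-empty and equally long")
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def attack_success_rate(
    triggered_preds: Sequence[int],
    true_labels: Sequence[int],
    target_label: int,
) -> float:
    """Fraction of triggered inputs that the model assigns to the target class.

    Assumption: inputs whose true label is already the target are skipped,
    since predicting the target for them does not indicate a backdoor.
    """
    if len(triggered_preds) != len(true_labels):
        raise ValueError("triggered_preds and true_labels must be equally long")
    eligible = [p for p, y in zip(triggered_preds, true_labels) if y != target_label]
    if not eligible:
        raise ValueError("no eligible triggered inputs")
    return sum(p == target_label for p in eligible) / len(eligible)


# Toy example with made-up predictions (not results from the paper).
ca = clean_accuracy(preds=[0, 1, 2, 2], labels=[0, 1, 2, 1])
asr = attack_success_rate(
    triggered_preds=[3, 3, 0, 3], true_labels=[0, 1, 2, 3], target_label=3
)
```

Under these definitions, an effective defense lowers the attack success rate on triggered inputs while leaving clean accuracy on trigger-free inputs close to its original value.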

Abstract

Multimodal contrastive learning has emerged as a powerful paradigm for building high-quality features using the complementary strengths of various data modalities. However, the open nature of such systems inadvertently increases the possibility of backdoor attacks. These attacks subtly embed malicious behaviors within the model during training, which can be activated by specific triggers in the inference phase, posing significant security risks. Although existing fine-tuning countermeasures reduce the adverse impact of such attacks, they often degrade clean accuracy and require the construction of extensive clean training pairs. In this paper, we explore the possibility of a lower-cost defense from the perspective of model unlearning, that is, whether the model can be made to quickly unlearn backdoor threats (UBT) by constructing…

Taxonomy

Topics: Network Security and Intrusion Detection · Speech Recognition and Synthesis · Fire Detection and Safety Systems

Methods: Sparse Evolutionary Training · Contrastive Learning