Robust Defense Strategies for Multimodal Contrastive Learning: Efficient Fine-tuning Against Backdoor Attacks

Md. Iqbal Hossain; Afia Sajeeda; Neeresh Kumar Perla; Ming Shao

arXiv:2511.13545·cs.CV·November 18, 2025

TL;DR

This paper proposes an efficient method to detect and mitigate backdoor attacks in multimodal contrastive models such as CLIP: it identifies the backdoor trigger and the affected labels, then uses them for targeted fine-tuning that restores model robustness.

Contribution

The paper introduces a strategy in which an image segmentation oracle supervises the output of a poisoned CLIP model, so that backdoor triggers and affected labels can be identified and the model can be repaired through targeted fine-tuning.
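As a rough illustration of how an oracle's output could help pinpoint affected labels, the sketch below compares a model's predicted labels with an oracle's labels and flags any label that the model predicts often while the oracle rarely agrees. This is not the paper's algorithm, whose details are not given here; the function name `flag_suspect_labels`, the thresholds `min_count` and `ratio`, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def flag_suspect_labels(model_preds, oracle_preds, min_count=5, ratio=0.5):
    """Flag labels the model predicts but the oracle mostly disagrees with.

    Hypothetical helper (an assumption, not taken from the paper): for each
    label predicted at least `min_count` times, compute the fraction of those
    predictions where the oracle gave a different label, and keep labels
    whose disagreement rate exceeds `ratio`.
    """
    predicted = Counter(model_preds)
    disagreed = Counter(
        m for m, o in zip(model_preds, oracle_preds) if m != o
    )
    suspects = {}
    for label, n in predicted.items():
        if n < min_count:
            continue  # too few predictions to judge this label
        rate = disagreed[label] / n
        if rate > ratio:
            suspects[label] = rate
    return suspects


# Toy data (assumed): the model over-predicts "banana" on images the
# oracle labels as "dog" or "cat".
model_preds = ["banana"] * 8 + ["dog"] * 6 + ["cat"] * 6
oracle_preds = (
    ["dog"] * 3 + ["cat"] * 3 + ["banana"] * 2  # model said "banana"
    + ["dog"] * 6                               # model said "dog"
    + ["cat"] * 5 + ["dog"]                     # model said "cat"
)
print(flag_suspect_labels(model_preds, oracle_preds))  # {'banana': 0.75}
```

In this toy setting a flagged label is only a candidate victim label; a real pipeline would still need to inspect the corresponding samples before fine-tuning on them.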

Findings

01. Effective detection of backdoor triggers and victim labels.

02. Successful rectification of poisoned CLIP models.

03. Improved robustness demonstrated on visual benchmarks.

Abstract

The advent of multimodal deep learning models, such as CLIP, has unlocked new frontiers in a wide range of applications, from image-text understanding to classification tasks. However, these models are not safe from adversarial attacks, particularly backdoor attacks, which can subtly manipulate model behavior. Moreover, existing defense methods typically involve training from scratch or fine-tuning on a large dataset without pinpointing the specific labels that are affected. In this study, we introduce an innovative strategy to enhance the robustness of multimodal contrastive learning models against such attacks. In particular, given a poisoned CLIP model, our approach can efficiently identify the backdoor trigger and pinpoint the victim samples and labels. To that end, an image segmentation "oracle" is introduced as the supervisor for the output of the poisoned CLIP. We…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Advanced Graph Neural Networks