Test-Time Multimodal Backdoor Detection by Contrastive Prompting

Yuwei Niu; Shuo He; Qi Wei; Zongyu Wu; Feng Liu; Lei Feng

arXiv:2405.15269 · cs.CV · September 23, 2025

Open Access

TL;DR

This paper introduces BDetCLIP, a novel, efficient test-time method for detecting backdoored images in multimodal contrastive models like CLIP, using contrastive prompting and distribution differences in similarity scores.

Contribution

It is the first to propose a computationally efficient, inference-stage backdoor detection method for CLIP that leverages contrastive prompting and language models.

Findings

01. BDetCLIP detects backdoored images more effectively than existing methods.

02. It is more efficient, with lower computational cost.

03. It successfully detects backdoored images in various scenarios.

Abstract

While multimodal contrastive learning methods (e.g., CLIP) can achieve impressive zero-shot classification performance, recent research has revealed that these methods are vulnerable to backdoor attacks. To defend against backdoor attacks on CLIP, existing defense methods focus on either the pre-training stage or the fine-tuning stage, which would unfortunately cause high computational costs due to numerous parameter updates and are not applicable in black-box settings. In this paper, we provide the first attempt at a computationally efficient backdoor detection method to defend against backdoored CLIP in the inference stage. We empirically find that the visual representations of backdoored images are insensitive to benign and malignant changes in class description texts. Motivated by this observation, we propose BDetCLIP, a novel test-time backdoor detection…
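
The observation in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the embeddings are hand-picked toy vectors rather than CLIP outputs, and the names contrast_score and is_suspected_backdoor, as well as the threshold value, are hypothetical. The sketch assumes that each class has one benign and one malignant description, and that an image is flagged when its similarity barely changes between the two.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrast_score(image_emb, benign_embs, malignant_embs):
    """Sum, over classes, of (similarity to the benign description)
    minus (similarity to the malignant description).

    Assumption: a score near zero means the image representation is
    insensitive to the change in description text.
    """
    return sum(
        cosine(image_emb, b) - cosine(image_emb, m)
        for b, m in zip(benign_embs, malignant_embs)
    )


def is_suspected_backdoor(image_emb, benign_embs, malignant_embs, threshold):
    """Flag the image when its contrast score falls below the threshold.

    The threshold is a hypothetical tuning parameter, not a value
    taken from the paper.
    """
    return contrast_score(image_emb, benign_embs, malignant_embs) < threshold


# Toy text embeddings for two classes (illustrative values only).
benign_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
malignant_embs = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]

# A toy "clean" image aligns with one benign description.
clean_image = [1.0, 0.0, 0.0]
# A toy "triggered" image is equally similar to every description.
triggered_image = [1.0, 1.0, 1.0]

threshold = 0.5  # hypothetical value chosen for this toy data

clean_flag = is_suspected_backdoor(clean_image, benign_embs, malignant_embs, threshold)
triggered_flag = is_suspected_backdoor(triggered_image, benign_embs, malignant_embs, threshold)
```

With these toy vectors only the triggered image is flagged. In a real setting the embeddings would come from the model's image and text encoders, and the threshold would have to be chosen on held-out data; both points are assumptions of this sketch, not details stated in the abstract.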

Taxonomy

Topics: Web Data Mining and Analysis

Methods: Focus · Contrastive Learning · Contrastive Language-Image Pre-training