CLIP-Guided Backdoor Defense through Entropy-Based Poisoned Dataset Separation

Binyan Xu; Fan Yang; Xilin Dai; Di Tang; Kehuan Zhang

arXiv:2507.05113 · cs.MM · July 28, 2025

Open Access · 1 Repo

TL;DR

This paper introduces CGD, a CLIP-guided backdoor defense method that efficiently detects and neutralizes various backdoor attacks in DNNs by leveraging CLIP's capabilities, achieving near-zero attack success rates with minimal impact on clean accuracy.

Contribution

The paper presents a novel CLIP-guided approach for backdoor defense that is both efficient and effective against diverse attacks, including clean-label and clean-image backdoors.

Findings

01. CGD reduces attack success rates (ASRs) to below 1%.
02. Clean accuracy (CA) drops by at most 0.3%.
03. CGD outperforms existing backdoor defenses across multiple datasets and attack types.
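
The two metrics behind these findings can be made concrete with a short sketch. It assumes a common convention that this page does not confirm for the paper: ASR is measured on triggered test inputs whose true class is not the attack target, and CA is plain top-1 accuracy on clean test inputs. The function names and arguments are hypothetical.

```python
import numpy as np


def clean_accuracy(preds, labels):
    """Fraction of clean test inputs classified correctly (CA)."""
    preds = np.asarray(preds)
    labels = np.asarray(labels)
    return float((preds == labels).mean())


def attack_success_rate(preds_on_triggered, true_labels, target_label):
    """Fraction of triggered inputs classified as the attack target (ASR).

    Assumption: samples whose true class already equals the target are
    excluded, since predicting the target for them is not an attack success.
    """
    preds = np.asarray(preds_on_triggered)
    true = np.asarray(true_labels)
    mask = true != target_label
    if not mask.any():
        raise ValueError("no non-target samples to evaluate")
    return float((preds[mask] == target_label).mean())
```

Under this convention, a defense succeeds when `attack_success_rate` falls toward zero while `clean_accuracy` stays close to that of a model trained on clean data.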

Abstract

Deep Neural Networks (DNNs) are susceptible to backdoor attacks, where adversaries poison training data to implant a backdoor into the victim model. Current backdoor defenses on poisoned data often suffer from high computational costs or low effectiveness against advanced attacks like clean-label and clean-image backdoors. To address these limitations, we introduce CLIP-Guided backdoor Defense (CGD), an efficient and effective method that mitigates various backdoor attacks. CGD uses a publicly accessible CLIP model to identify inputs that are likely to be clean or poisoned. It then retrains the model on these inputs, using CLIP's logits as guidance to neutralize the backdoor. Experiments on 4 datasets and 11 attack types demonstrate that CGD reduces attack success rates (ASRs) to below 1% while maintaining clean accuracy (CA) with a maximum drop of only 0.3%, outperforming existing…
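
The separation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the official repository for that). It assumes CLIP zero-shot logits have already been computed for every training image, scores each sample by the cross-entropy between CLIP's prediction and the dataset label, and flags the highest-scoring fraction as suspect. The function names and the `suspect_fraction` parameter are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def label_cross_entropy(clip_logits, labels):
    """Cross-entropy of each dataset label under CLIP's zero-shot prediction.

    clip_logits: array of shape (N, C), assumed precomputed elsewhere.
    labels: integer array of shape (N,), the (possibly poisoned) labels.
    """
    probs = softmax(np.asarray(clip_logits, dtype=np.float64))
    labels = np.asarray(labels)
    picked = probs[np.arange(len(labels)), labels]
    return -np.log(np.clip(picked, 1e-12, None))


def split_by_entropy(clip_logits, labels, suspect_fraction=0.2):
    """Split sample indices into (likely clean, likely poisoned).

    Assumption: samples whose label disagrees most with CLIP are the
    suspects; the cut-off is a simple quantile, chosen for illustration.
    """
    ce = label_cross_entropy(clip_logits, labels)
    threshold = np.quantile(ce, 1.0 - suspect_fraction)
    suspect = ce > threshold
    return np.flatnonzero(~suspect), np.flatnonzero(suspect)
```

For example, with five samples where CLIP confidently agrees with the first four labels and disagrees with the fifth, `split_by_entropy(logits, labels, suspect_fraction=0.2)` places only the fifth index in the suspect set. Retraining on the two subsets with CLIP's logits as guidance is not shown here.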

Code & Models

Repositories

binyxu/CGD (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Security and Verification in Computing · Network Security and Intrusion Detection

Methods: Contrastive Language-Image Pre-training