Perturb and Recover: Fine-tuning for Effective Backdoor Removal from CLIP

Naman Deep Singh, Francesco Croce, and Matthias Hein

arXiv:2412.00727 · cs.LG · April 8, 2026

TL;DR

This paper introduces PAR, a simple and effective fine-tuning method to remove backdoors from CLIP models, outperforming existing techniques and working even with synthetic data.

Contribution

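The paper proposes PAR (Perturb and Recover), a fine-tuning approach for removing backdoors from CLIP models. It targets a gap in existing cleaning techniques, which the authors show are not effective against the simple structured triggers used in Blended or BadNet attacks, and it is reported to remain effective across a range of attacks.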

Findings

01. PAR achieves high backdoor removal rates.

02. PAR preserves standard model performance.

03. PAR is effective even with synthetic training data.

Abstract

Vision-Language models like CLIP have been shown to be highly effective at linking visual perception and natural language understanding, enabling sophisticated image-text capabilities, including strong retrieval and zero-shot classification performance. Their widespread use, as well as the fact that CLIP models are trained on image-text pairs from the web, make them both a worthwhile and relatively easy target for backdoor attacks. As training foundational models, such as CLIP, from scratch is very expensive, this paper focuses on cleaning potentially poisoned models via fine-tuning. We first show that existing cleaning techniques are not effective against simple structured triggers used in Blended or BadNet backdoor attacks, exposing a critical vulnerability for potential real-world deployment of these models. Then, we introduce PAR, Perturb and Recover, a surprisingly simple yet…
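To make the "simple structured triggers" mentioned in the abstract concrete, the following is a minimal sketch of how BadNet-style and Blended-style triggers are commonly applied to an image. It is illustrative only: the function names, the patch size and position, the blending weight `alpha`, and the random-noise trigger are all assumptions chosen for the example, not values or code taken from the paper or its repository.

```python
import numpy as np


def apply_badnet_patch(image, patch, top=0, left=0):
    """BadNet-style trigger: paste a small fixed patch onto a copy of the image.

    `top` and `left` are hypothetical placement parameters for this sketch.
    """
    poisoned = image.copy()
    h, w = patch.shape[:2]
    poisoned[top:top + h, left:left + w] = patch
    return poisoned


def apply_blended_trigger(image, trigger, alpha=0.2):
    """Blended-style trigger: convex combination of the image and a full-size trigger.

    `alpha` is an assumed blending weight, not a value from the paper.
    """
    if image.shape != trigger.shape:
        raise ValueError("image and trigger must have the same shape")
    return (1.0 - alpha) * image + alpha * trigger


# Toy data: a random "image" with pixel values in [0, 1].
rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))

# Assumed triggers: a 16x16 white square and a full-size noise pattern.
patch = np.ones((16, 16, 3))
noise = rng.random((224, 224, 3))

badnet_image = apply_badnet_patch(image, patch, top=208, left=208)
blended_image = apply_blended_trigger(image, noise, alpha=0.2)
```

In both cases the trigger is a fixed, input-independent pattern, which is what makes it "structured". A cleaning method such as the one the abstract describes would aim to make the model ignore such patterns while keeping its behavior on clean images.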

Code & Models

Repositories

nmndeep/PerturbAndRecover (GitHub)