Backdoor Cleansing with Unlabeled Data

Lu Pang; Tao Sun; Haibin Ling; Chao Chen

arXiv:2211.12044·cs.LG·July 4, 2023

TL;DR

This paper introduces a label-free backdoor defense method for neural networks that uses layer-wise re-initialization and knowledge distillation, effectively mitigating backdoor threats without requiring clean labeled data.

Contribution

The proposed method defends against backdoor attacks without labeled clean data, which makes it practical when end users cannot access the original training data.

Findings

01 Effective backdoor removal, comparable to label-dependent methods

02 Maintains normal model performance after cleansing

03 Works well even on out-of-distribution data

Abstract

Due to the increasing computational demand of Deep Neural Networks (DNNs), companies and organizations have begun to outsource the training process. However, the externally trained DNNs can potentially be backdoor attacked. It is crucial to defend against such attacks, i.e., to postprocess a suspicious model so that its backdoor behavior is mitigated while its normal prediction power on clean inputs remains uncompromised. To remove the abnormal backdoor behavior, existing methods mostly rely on additional labeled clean samples. However, such a requirement may be unrealistic, as the training data are often unavailable to end users. In this paper, we investigate the possibility of circumventing such a barrier. We propose a novel defense method that does not require training labels. Through a carefully designed layer-wise weight re-initialization and knowledge distillation, our method can…
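
The two ingredients named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation (their official PyTorch code is in the repository listed below). All names here (`softmax`, `distillation_loss`, `reinit_layerwise`, `ratios`, `scale`) are hypothetical, the per-layer ratios are arbitrary, and the loss is the standard temperature-softened KL distillation loss, assumed here for illustration. The paper's actual re-initialization schedule and objective may differ.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs; uses no ground-truth label."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def reinit_layerwise(layers, ratios, rng, scale=0.1):
    """Copy the teacher's weights, then re-draw a per-layer fraction at random.

    `ratios` and `scale` are illustrative assumptions, not values from the paper.
    """
    student = []
    for weights, ratio in zip(layers, ratios):
        new_weights = list(weights)
        k = round(ratio * len(weights))
        for i in rng.sample(range(len(weights)), k):
            new_weights[i] = rng.gauss(0.0, scale)
        student.append(new_weights)
    return student

# Toy "model": two layers, each a flat list of weights.
rng = random.Random(0)
teacher_layers = [[0.5, -0.2, 0.1, 0.9], [1.0, -1.0, 0.3, 0.7]]
student_layers = reinit_layerwise(teacher_layers, ratios=[0.0, 0.5], rng=rng)

# The student is trained to match the teacher's outputs on unlabeled inputs.
loss = distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

In this sketch the student starts from the suspicious model's weights with part of each layer randomized, and the distillation loss depends only on the teacher's outputs, which is why no labels are needed.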

Code & Models

Repositories

luluppang/bcu (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Malware Detection Techniques