Adversarial Neuron Pruning Purifies Backdoored Deep Models

Dongxian Wu; Yisen Wang

arXiv:2110.14430 · cs.LG · October 28, 2021 · 27 cites

TL;DR

This paper introduces Adversarial Neuron Pruning (ANP), a model-repair method that removes backdoors from deep neural networks by pruning the neurons most sensitive to adversarial perturbation. It works with only a small amount of clean data, which makes it suited to outsourced-training scenarios.

Contribution

The paper presents a neuron pruning technique that exploits the sensitivity of backdoored DNNs to adversarial neuron perturbations in order to purify the model of its backdoor.
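The idea of pruning perturbation-sensitive neurons can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a hand-built two-layer ReLU network, a multiplicative perturbation on each hidden neuron's incoming weights and bias bounded by a hypothetical budget eps, and a greedy per-neuron score in place of any joint optimization. All names here (forward, sensitivity, prune_mask, params, clean_data) are illustrative.

```python
import math
from itertools import product


def forward(x, params, delta, xi, mask):
    """Two-layer ReLU net. Hidden neuron j has its weights scaled by
    (1 + delta[j]), its bias by (1 + xi[j]), and its output by mask[j]."""
    W1, b1, W2, b2 = params
    hidden = []
    for j, (w, b) in enumerate(zip(W1, b1)):
        pre = (1.0 + delta[j]) * sum(wi * v for wi, v in zip(w, x)) + (1.0 + xi[j]) * b
        hidden.append(mask[j] * max(0.0, pre))
    return [sum(wk[j] * hidden[j] for j in range(len(hidden))) + bk
            for wk, bk in zip(W2, b2)]


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one sample."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def mean_loss(data, params, delta, xi, mask):
    return sum(cross_entropy(forward(x, params, delta, xi, mask), y)
               for x, y in data) / len(data)


def sensitivity(data, params, eps):
    """Per-neuron score: worst-case increase in clean loss when only that
    neuron's weights and bias are perturbed by +/- eps (toy assumption)."""
    n = len(params[0])
    zeros, ones = [0.0] * n, [1.0] * n
    base = mean_loss(data, params, zeros, zeros, ones)
    scores = []
    for j in range(n):
        worst = base
        for sd, sx in product((-eps, eps), repeat=2):
            delta, xi = list(zeros), list(zeros)
            delta[j], xi[j] = sd, sx
            worst = max(worst, mean_loss(data, params, delta, xi, ones))
        scores.append(worst - base)
    return scores


def prune_mask(scores, n_prune):
    """Zero out the n_prune neurons with the highest sensitivity score."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    pruned = set(order[:n_prune])
    return [0.0 if j in pruned else 1.0 for j in range(len(scores))]


def predict(x, params, mask):
    zeros = [0.0] * len(mask)
    logits = forward(x, params, zeros, zeros, mask)
    return logits.index(max(logits))


# Hypothetical toy model: neurons 0 and 1 carry the clean features; neuron 2
# is dormant on clean inputs but wired strongly to class 1 (the "target").
params = (
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],   # W1: hidden x input
    [0.0, 0.0, -1.2],                        # b1
    [[4.0, 0.0, 0.0], [0.0, 4.0, 20.0]],     # W2: class x hidden
    [0.0, 0.0],                              # b2
)
clean_data = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]

scores = sensitivity(clean_data, params, eps=0.4)
mask = prune_mask(scores, n_prune=1)
```

In this toy setup the dormant third neuron receives by far the largest score and is the one pruned, while the pruned network still classifies both clean samples correctly. A real model would need the scores estimated on held-out clean data and the number of pruned neurons tuned rather than fixed.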

Findings

01. ANP removes backdoors using only a small amount of clean data

02. Backdoored DNNs are more sensitive to adversarial neuron perturbations

03. ANP maintains model performance while purifying backdoors

Abstract

As deep neural networks (DNNs) grow larger, their demand for computational resources becomes huge, which makes outsourcing training more popular. Training on a third-party platform, however, introduces the risk that a malicious trainer will return backdoored DNNs, which behave normally on clean samples but output targeted misclassifications whenever a trigger appears at test time. Without any knowledge of the trigger, it is difficult to distinguish or recover benign DNNs from backdoored ones. In this paper, we first identify an unexpected sensitivity of backdoored DNNs: they collapse much more easily and tend to predict the target label on clean samples when their neurons are adversarially perturbed. Based on these observations, we propose a novel model repairing method, termed Adversarial Neuron Pruning (ANP), which prunes some sensitive neurons to…

Code & Models

Videos

Adversarial Neuron Pruning Purifies Backdoored Deep Models · slideslive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications

Methods: Test · Pruning