Defending against Backdoor Attack on Deep Neural Networks
Hao Cheng, Kaidi Xu, Sijia Liu, Pin-Yu Chen, Pu Zhao, Xue Lin

TL;DR
This paper investigates backdoor attacks on deep neural networks, analyzes their effects on neuron responses, and proposes an $\ell_\infty$-based neuron pruning method that defends against such attacks while maintaining high accuracy.
Contribution
The paper introduces an $\ell_\infty$-based neuron pruning technique that reduces backdoor attack success rates without compromising model accuracy.
Findings
Backdoor attacks cause a significant $\ell_\infty$ bias in neuron activations.
The proposed pruning method reduces attack success rate effectively.
High classification accuracy is maintained on clean images after pruning.
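To make the first finding concrete, the toy sketch below compares how the $\ell_\infty$, $\ell_1$, and $\ell_2$ norms of a single activation map react to a small, localized spike. It is not taken from the paper: the 8x8 map, the 2x2 patch location, and the spike height of 5.0 are all hypothetical values chosen for illustration, under the assumption that a trigger produces a spatially concentrated response.

```python
import numpy as np

def activation_norms(act_map):
    """Return the (l_inf, l1, l2) norms of a 2-D activation map."""
    flat = np.abs(act_map).ravel()
    return flat.max(), flat.sum(), np.sqrt((flat ** 2).sum())

rng = np.random.default_rng(0)

# Toy "clean" activation map with values in [0, 1).
clean = rng.uniform(0.0, 1.0, size=(8, 8))

# Hypothetical trigger response: a strong spike confined to a 2x2 corner patch.
triggered = clean.copy()
triggered[6:8, 6:8] += 5.0

clean_norms = activation_norms(clean)
trig_norms = activation_norms(triggered)

# Relative growth of each norm (order: l_inf, l1, l2) caused by the spike.
ratios = [t / c for t, c in zip(trig_norms, clean_norms)]
for name, r in zip(("l_inf", "l1", "l2"), ratios):
    print(f"{name}: grows by a factor of {r:.2f}")
```

In this toy setting the $\ell_\infty$ norm grows the most, because it depends only on the single largest entry, while the $\ell_1$ and $\ell_2$ norms spread the spike's contribution over the whole map.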
Abstract
Although deep neural networks (DNNs) have achieved great success in various computer vision tasks, it was recently found that they are vulnerable to adversarial attacks. In this paper, we focus on the so-called \textit{backdoor attack}, which injects a backdoor trigger into a small portion of the training data (also known as data poisoning) such that the trained DNN misclassifies examples containing this trigger. To be specific, we carefully study the effect of both real and synthetic backdoor attacks on the internal response of vanilla and backdoored DNNs through the lens of Grad-CAM. Moreover, we show that the backdoor attack induces a significant bias in neuron activation in terms of the $\ell_\infty$ norm of an activation map compared to its $\ell_1$ and $\ell_2$ norm. Spurred by our results, we propose the \textit{$\ell_\infty$-based neuron pruning} to remove the…
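The abstract names the pruning idea but is cut off before giving details, so the following is only a minimal sketch of what $\ell_\infty$-based channel pruning could look like. It assumes, without confirmation from the text above, that channels are ranked by their average $\ell_\infty$ norm over a batch and that the highest-ranked channels are the ones zeroed out. The function names, the number of pruned channels, and the toy activations are hypothetical.

```python
import numpy as np

def linf_channel_scores(activations):
    """Per-channel l_inf norm, averaged over samples.

    activations: array of shape (N, C, H, W). Returns an array of shape (C,).
    """
    per_sample = np.abs(activations).max(axis=(2, 3))  # shape (N, C)
    return per_sample.mean(axis=0)

def prune_mask(scores, num_prune):
    """Boolean keep-mask that drops the num_prune highest-scoring channels."""
    keep = np.ones(scores.shape[0], dtype=bool)
    if num_prune > 0:
        keep[np.argsort(scores)[-num_prune:]] = False
    return keep

def apply_mask(activations, keep):
    """Zero out pruned channels; the output shape matches the input."""
    return activations * keep[None, :, None, None]

rng = np.random.default_rng(0)

# Toy activations: 16 samples, 6 channels, 4x4 maps, values in [0, 1).
acts = rng.uniform(0.0, 1.0, size=(16, 6, 4, 4))

# Hypothetical backdoor-sensitive channel: a sharp peak at one location.
acts[:, 3, 0, 0] += 10.0

scores = linf_channel_scores(acts)
keep = prune_mask(scores, num_prune=1)
pruned = apply_mask(acts, keep)
print("kept channels:", np.flatnonzero(keep).tolist())
```

Masking activations is used here only to keep the example framework-free; in a real network the same mask would be applied to the corresponding convolutional filters or feature maps.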
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Malware Detection Techniques
