Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks
Kang Liu, Brendan Dolan-Gavitt, Siddharth Garg

TL;DR
This paper introduces fine-pruning, a defense method combining pruning and fine-tuning, which effectively mitigates backdoor attacks in deep neural networks with minimal impact on accuracy.
Contribution
It presents the first effective defense against backdoor attacks on DNNs by combining pruning and fine-tuning, demonstrating significant reduction in attack success rate.
Findings
Fine-pruning reduces the attack success rate to 0% in some cases.
In those cases, accuracy on clean (non-triggering) inputs drops by only 0.4%.
Neither pruning nor fine-tuning alone is sufficient against sophisticated attackers.
Abstract
Deep neural networks (DNNs) provide excellent performance across a wide range of classification tasks, but their training requires high computational resources and is often outsourced to third parties. Recent work has shown that outsourced training introduces the risk that a malicious trainer will return a backdoored DNN that behaves normally on most inputs but causes targeted misclassifications or degrades the accuracy of the network when a trigger known only to the attacker is present. In this paper, we provide the first effective defenses against backdoor attacks on DNNs. We implement three backdoor attacks from prior work and use them to investigate two promising defenses, pruning and fine-tuning. We show that neither, by itself, is sufficient to defend against sophisticated attackers. We then evaluate fine-pruning, a combination of pruning and fine-tuning, and show that it…
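To make the combination of pruning and fine-tuning concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes pruning removes the channels that are least active on clean validation inputs, and it uses a small logistic-regression head as a stand-in for fine-tuning the full network. The function names `prune_mask` and `fine_tune_head`, the array shapes, and all hyperparameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def prune_mask(clean_activations, prune_fraction):
    """Return a 0/1 mask over channels that zeroes the least-active ones.

    clean_activations: array of shape (num_samples, num_channels) with each
    channel's activation on clean validation inputs (assumed input format).
    """
    mean_act = clean_activations.mean(axis=0)
    num_prune = int(prune_fraction * mean_act.size)
    order = np.argsort(mean_act, kind="stable")  # ascending mean activation
    mask = np.ones(mean_act.size)
    mask[order[:num_prune]] = 0.0
    return mask

def fine_tune_head(features, labels, weights, mask, lr=0.1, steps=200):
    """Fine-tune a logistic-regression head on clean data.

    Pruned channels are masked out of the features, so their gradient is
    zero and their weights stay at zero throughout training.
    """
    w = weights * mask          # new array; the caller's weights are untouched
    x = features * mask
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-(x @ w)))
        grad = x.T @ (probs - labels) / len(labels)
        w -= lr * grad
    return w

# Toy data: 8 channels, two of which are nearly dormant on clean inputs.
rng = np.random.default_rng(0)
acts = np.abs(rng.normal(size=(100, 8)))
acts[:, [2, 5]] *= 0.01
labels = (acts[:, 0] > acts[:, 1]).astype(float)

mask = prune_mask(acts, prune_fraction=0.25)
w = fine_tune_head(acts, labels, rng.normal(size=8), mask)
```

In this toy setup the two dormant channels are the ones selected for pruning, and the fine-tuning step only updates weights on the surviving channels. A real defense would apply the same two steps to a layer of the trained network rather than to a linear head.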
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Malware Detection Techniques
Methods: Pruning
