NNoculation: Catching BadNets in the Wild
Akshaj Kumar Veldanda, Kang Liu, Benjamin Tan, Prashanth Krishnamurthy, Farshad Khorrami, Ramesh Karri, Brendan Dolan-Gavitt, and Siddharth Garg

TL;DR
NNoculation is a two-stage defense mechanism for neural networks that repairs backdoored models pre-deployment and detects, quarantines, and retrains on backdoored inputs post-deployment, outperforming existing defenses.
Contribution
The paper introduces NNoculation, a novel two-stage approach combining model repair and input detection to defend against diverse backdoor attacks in neural networks.
Findings
Outperforms state-of-the-art defenses on various backdoor attacks
Effective even against adaptive attacks that bypass other defenses
Minimal assumptions required for successful defense
Abstract
This paper proposes a novel two-stage defense (NNoculation) against backdoored neural networks (BadNets) that repairs a BadNet both pre-deployment and online in response to backdoored test inputs encountered in the field. In the pre-deployment stage, NNoculation retrains the BadNet with random perturbations of clean validation inputs to partially reduce the adversarial impact of a backdoor. Post-deployment, NNoculation detects and quarantines backdoored test inputs by recording disagreements between the original and pre-deployment patched networks. A CycleGAN is then trained to learn transformations between clean validation and quarantined inputs; i.e., it learns to add triggers to clean validation images. Backdoored validation images along with their correct labels are used to further retrain the pre-deployment patched network, yielding our final defense. Empirical evaluation on a…
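The two mechanisms named in the abstract, random perturbation of clean validation inputs and quarantining inputs on which the original and patched networks disagree, can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian noise model, the flat-list image representation, the function names `perturb` and `quarantine_disagreements`, and the two toy classifiers are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple

Image = List[float]                    # assumption: an image is a flat list of pixels in [0, 1]
Classifier = Callable[[Image], int]    # assumption: a model maps an image to a class label


def perturb(image: Image, noise_scale: float, rng: random.Random) -> Image:
    """Return a noisy copy of a clean image (Gaussian noise is an assumed choice)."""
    return [min(1.0, max(0.0, p + rng.gauss(0.0, noise_scale))) for p in image]


def quarantine_disagreements(
    original: Classifier,
    patched: Classifier,
    inputs: Sequence[Image],
) -> Tuple[List[Tuple[Image, int]], List[Image]]:
    """Split inputs into accepted (with the patched label) and quarantined.

    An input is quarantined when the original and patched models disagree.
    """
    accepted: List[Tuple[Image, int]] = []
    quarantined: List[Image] = []
    for x in inputs:
        patched_label = patched(x)
        if original(x) != patched_label:
            quarantined.append(x)
        else:
            accepted.append((x, patched_label))
    return accepted, quarantined


# Hypothetical toy models: pixel 0 set to 1.0 plays the role of a trigger
# that forces the backdoored model to output the attacker's label 7.
def toy_badnet(x: Image) -> int:
    return 7 if x[0] == 1.0 else int(x[1] > 0.5)


def toy_patched(x: Image) -> int:
    return int(x[1] > 0.5)


test_inputs = [[0.0, 0.8], [1.0, 0.8], [0.0, 0.2]]
accepted, quarantined = quarantine_disagreements(toy_badnet, toy_patched, test_inputs)
noisy = perturb([0.0, 0.8], noise_scale=0.1, rng=random.Random(0))
```

In this toy setting only the input carrying the trigger pixel lands in `quarantined`. In the pipeline the abstract describes, such quarantined inputs would then serve as one of the two image domains for CycleGAN training.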
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning
Methods: Test · Batch Normalization · Residual Connection · PatchGAN · Tanh Activation · Residual Block · Instance Normalization · Convolution
