Batch Normalization is a Cause of Adversarial Vulnerability
Angus Galloway, Anna Golubeva, Thomas Tanay, Medhat Moussa, and Graham W. Taylor

TL;DR
This paper demonstrates that batch normalization, while aiding training, increases neural network vulnerability to adversarial attacks, and replacing it with weight decay can mitigate this effect.
Contribution
The paper identifies batch normalization as a cause of adversarial vulnerability and shows that substituting weight decay for it removes the dependence of that vulnerability on input dimension.
Findings
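Batch norm reduces robustness to small adversarial input perturbations and noise by double-digit percentages on five standard datasets.
Substituting weight decay for batch norm is sufficient to nullify the relationship between adversarial vulnerability and input dimension.
The results are consistent with a mean-field analysis that found batch norm causes exploding gradients.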
Abstract
Batch normalization (batch norm) is often used in an attempt to stabilize and accelerate training in deep neural networks. In many cases it indeed decreases the number of parameter updates required to achieve low training error. However, it also reduces robustness to small adversarial input perturbations and noise by double-digit percentages, as we show on five standard datasets. Furthermore, substituting weight decay for batch norm is sufficient to nullify the relationship between adversarial vulnerability and the input dimension. Our work is consistent with a mean-field analysis that found that batch norm causes exploding gradients.
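For readers less familiar with the two techniques the abstract contrasts, the following is a minimal NumPy sketch of each. It is not the authors' code: the function names `batch_norm` and `sgd_step_with_weight_decay`, the toy data, and all hyperparameter values are assumptions chosen for illustration only.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature (column) of a batch to zero mean and
    (approximately) unit variance, then apply scale gamma and shift beta."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def sgd_step_with_weight_decay(w, grad, lr=0.1, weight_decay=1e-2):
    """One SGD step on loss + (weight_decay / 2) * ||w||^2.
    The extra term pulls the weights toward zero at every step."""
    return w - lr * (grad + weight_decay * w)

# Toy batch (hypothetical data): two features with very different scales.
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=[0.1, 10.0], size=(256, 2))
x_bn = batch_norm(x)

# With a zero loss gradient, weight decay alone shrinks the weights.
w = np.array([1.0, -2.0])
w_next = sgd_step_with_weight_decay(w, grad=np.zeros_like(w))
```

In this sketch, batch norm divides the low-variance feature by a standard deviation of about 0.1, so differences in that input coordinate are scaled up by roughly a factor of 10 after normalization. Weight decay, by contrast, leaves the inputs alone and only shrinks the weights. This is an observation about the toy example, not a restatement of the paper's analysis.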
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Integrated Circuits and Semiconductor Failure Analysis
Methods: Weight Decay
