On Fragile Features and Batch Normalization in Adversarial Training
Nils Philipp Walter, David Stutz, Bernt Schiele

TL;DR
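This paper investigates the role of batch normalization (BN) in adversarial training. It shows that adversarially fine-tuning only the BN layers can confer moderate robustness by building on fragile features, whereas adversarially training only the BN layers from scratch, on top of random features, cannot.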
Contribution
It demonstrates that adversarially fine-tuning only BN layers can improve robustness, highlighting the importance of fragile features in adversarial training.
Findings
On CIFAR10, adversarially fine-tuning only the BN layers yields non-trivial adversarial robustness.
Adversarially training only the BN layers from scratch does not yield meaningful adversarial robustness.
Fragile features can be exploited for moderate adversarial robustness.
Abstract
Modern deep learning architectures utilize batch normalization (BN) to stabilize training and improve accuracy. It has been shown that the BN layers alone are surprisingly expressive. In the context of robustness against adversarial examples, however, BN is argued to increase vulnerability. That is, BN helps to learn fragile features. Nevertheless, BN is still used in adversarial training, which is the de-facto standard to learn robust features. In order to shed light on the role of BN in adversarial training, we investigate to what extent the expressiveness of BN can be used to robustify fragile features in comparison to random features. On CIFAR10, we find that adversarially fine-tuning just the BN layers can result in non-trivial adversarial robustness. Adversarially training only the BN layers from scratch, in contrast, is not able to convey meaningful adversarial robustness. Our…
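To make the idea of adversarially updating only the BN parameters concrete, here is a minimal NumPy sketch. It is not the authors' implementation and makes several assumptions: the model is a toy frozen linear layer, BN, ReLU, and frozen linear readout rather than a CIFAR10 network; the frozen weights `W` and `v` are random placeholders standing in for pre-trained weights; BN uses fixed statistics `mu` and `sigma` instead of batch statistics; and the attack is single-step FGSM, chosen for brevity. All function and variable names are hypothetical.

```python
import numpy as np

def forward(x, W, v, mu, sigma, gamma, beta):
    """Frozen linear -> BN with fixed statistics -> ReLU -> frozen linear."""
    z_hat = (x @ W.T - mu) / sigma        # normalize (statistics are not trained)
    a = gamma * z_hat + beta              # BN scale and shift: the only trainable part
    logit = np.maximum(a, 0.0) @ v
    return z_hat, a, logit

def gradients(x, y, W, v, mu, sigma, gamma, beta):
    """Logistic-loss gradients w.r.t. gamma, beta (batch mean) and x (per example)."""
    z_hat, a, logit = forward(x, W, v, mu, sigma, gamma, beta)
    p = 1.0 / (1.0 + np.exp(-logit))          # predicted probability of class 1
    d_a = ((p - y)[:, None] * v) * (a > 0)    # backpropagate through readout and ReLU
    d_gamma = (d_a * z_hat).mean(axis=0)
    d_beta = d_a.mean(axis=0)
    d_x = (d_a * gamma / sigma) @ W
    return d_gamma, d_beta, d_x

def fgsm(x, y, eps, *params):
    """Single-step L-infinity attack: shift inputs by eps along the gradient sign."""
    _, _, d_x = gradients(x, y, *params)
    return x + eps * np.sign(d_x)

def finetune_bn_only(x, y, W, v, mu, sigma, gamma, beta, eps=0.1, lr=0.1, steps=100):
    """Adversarial training loop that updates gamma and beta and nothing else."""
    gamma, beta = gamma.copy(), beta.copy()
    for _ in range(steps):
        x_adv = fgsm(x, y, eps, W, v, mu, sigma, gamma, beta)
        d_gamma, d_beta, _ = gradients(x_adv, y, W, v, mu, sigma, gamma, beta)
        gamma -= lr * d_gamma
        beta -= lr * d_beta
    return gamma, beta

# Toy data and placeholder weights (assumptions, not the paper's setup).
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 8))
y = (x[:, 0] > 0).astype(float)
W = rng.normal(size=(16, 8))              # stand-in for frozen pre-trained weights
v = rng.normal(size=16)                   # stand-in for a frozen readout layer
z = x @ W.T
mu, sigma = z.mean(axis=0), z.std(axis=0) + 1e-5
gamma0, beta0 = np.ones(16), np.zeros(16)
W_before, v_before = W.copy(), v.copy()

gamma, beta = finetune_bn_only(x, y, W, v, mu, sigma, gamma0, beta0)
```

In this sketch, the "from scratch" setting described in the abstract corresponds to keeping `W` and `v` at their random initialization, while the fine-tuning setting would load them from a previously trained network before running the same loop.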
Taxonomy
Topics: Adversarial Robustness in Machine Learning
Methods: Batch Normalization
