Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples
Sven Gowal, Chongli Qin, Jonathan Uesato, Timothy Mann, Pushmeet Kohli

TL;DR
This paper systematically investigates the limits of adversarial training for deep neural networks and shows that combining larger models, Swish/SiLU activations, and model weight averaging significantly improves robustness against norm-bounded adversarial examples.
Contribution
It uncovers how model size, activation functions, and unlabeled data can be combined to surpass previous adversarial robustness benchmarks.
Findings
Achieved state-of-the-art robustness on CIFAR-10 and CIFAR-100.
Larger models and additional unlabeled data yield large improvements in accuracy under attack.
The robustness gains carry over across different perturbation norms without norm-specific modifications.
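To make "norm-bounded" concrete, here is a minimal sketch of how a perturbation can be projected back into a norm ball, the constraint that defines the threat model. The function names `project_linf` and `project_l2` are hypothetical, and the radii in the usage lines are illustrative assumptions; this is not code from the paper.

```python
import math


def project_linf(delta, eps):
    """Clip each coordinate so that max(|delta_i|) <= eps."""
    return [max(-eps, min(eps, d)) for d in delta]


def project_l2(delta, eps):
    """Rescale delta so that its Euclidean norm is at most eps."""
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= eps:
        return list(delta)  # already inside the ball
    scale = eps / norm
    return [d * scale for d in delta]


# Illustrative usage on a 3-pixel "image" perturbation (radii are assumptions).
delta = [0.1, -0.2, 0.01]
bounded_linf = project_linf(delta, eps=8 / 255)
bounded_l2 = project_l2(delta, eps=0.1)
```

An iterative attack would typically alternate a gradient step on the perturbation with one of these projections, so the final perturbation never leaves the allowed ball.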
Abstract
Adversarial training and its variants have become de facto standards for learning robust deep neural networks. In this paper, we explore the landscape around adversarial training in a bid to uncover its limits. We systematically study the effect of different training losses, model sizes, activation functions, the addition of unlabeled data (through pseudo-labeling) and other factors on adversarial robustness. We discover that it is possible to train robust models that go well beyond state-of-the-art results by combining larger models, Swish/SiLU activations and model weight averaging. We demonstrate large improvements on CIFAR-10 and CIFAR-100 against $\ell_\infty$ and $\ell_2$ norm-bounded perturbations of size $8/255$ and $128/255$, respectively. In the setting with additional unlabeled data, we obtain an accuracy under attack of 65.88% against $\ell_\infty$ perturbations of size…
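Two of the ingredients named in the abstract are simple enough to sketch directly. The snippet below shows the Swish/SiLU activation, x · sigmoid(x), and one common form of model weight averaging, an exponential moving average of the weights. The abstract does not specify the averaging scheme or its decay, so `ema_update` and the decay values are assumptions made for illustration only.

```python
import math


def silu(x):
    """Swish/SiLU activation: x * sigmoid(x), written to avoid overflow."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)  # safe for very negative x (underflows to 0.0)
    return x * e / (1.0 + e)


def ema_update(avg, weights, decay=0.995):
    """One exponential-moving-average step over a flat list of weights.

    The default decay is an illustrative assumption, not a value taken
    from the text above.
    """
    return [decay * a + (1.0 - decay) * w for a, w in zip(avg, weights)]


# Toy usage: the averaged weights trail the "trained" weights.
weights = [0.0, 0.0]
avg = list(weights)
for _ in range(3):
    weights = [w + 1.0 for w in weights]  # stand-in for an optimizer step
    avg = ema_update(avg, weights, decay=0.5)
```

In this kind of setup the averaged copy, rather than the raw weights, would be the one evaluated; the raw weights keep being updated by the optimizer.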
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Physical Unclonable Functions (PUFs) and Hardware Security · Advanced Malware Detection Techniques
