Understanding and Improving Fast Adversarial Training

Maksym Andriushchenko; Nicolas Flammarion

arXiv:2007.02617·cs.LG·October 27, 2020·45 cites



TL;DR

This paper investigates why fast adversarial training based on FGSM fails through catastrophic overfitting, and introduces GradAlign, a regularization technique that improves the robustness of models trained with FGSM-based adversarial training.

Contribution

The paper demonstrates that adding a random step to FGSM does not prevent catastrophic overfitting, shows that the phenomenon already occurs in a single-layer convolutional network with a few filters, and proposes GradAlign to improve fast adversarial training.

Findings

01

Randomness does not prevent catastrophic overfitting.

02

Even a single filter can cause catastrophic overfitting in a simple network.

03

GradAlign improves FGSM training for larger perturbations.

Abstract

A recent line of work focused on making adversarial training computationally efficient for deep learning models. In particular, Wong et al. (2020) showed that $\ell_\infty$-adversarial training with fast gradient sign method (FGSM) can fail due to a phenomenon called "catastrophic overfitting", when the model quickly loses its robustness over a single epoch of training. We show that adding a random step to FGSM, as proposed in Wong et al. (2020), does not prevent catastrophic overfitting, and that randomness is not important per se -- its main role being simply to reduce the magnitude of the perturbation. Moreover, we show that catastrophic overfitting is not inherent to deep and overparametrized networks, but can occur in a single-layer convolutional network with a few filters. In an extreme case, even a single filter can make the network highly non-linear locally, which is the main…
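As an illustration of the abstract's remark that the random step mainly reduces the magnitude of the perturbation, here is a minimal NumPy sketch comparing plain FGSM with FGSM preceded by a uniform random step. It is not the authors' code: the logistic-regression model, the function names, and the step size `alpha = 1.25 * eps` are assumptions chosen for illustration.

```python
import numpy as np


def loss_and_input_grad(w, b, x, y):
    """Logistic loss for a label y in {-1, +1} and its gradient w.r.t. the input x."""
    margin = y * (w @ x + b)
    loss = np.log1p(np.exp(-margin))
    # d/dx log(1 + exp(-y (w.x + b))) = -y * w / (1 + exp(margin))
    grad_x = -y * w / (1.0 + np.exp(margin))
    return loss, grad_x


def fgsm(w, b, x, y, eps):
    """Plain FGSM: one signed-gradient step of size eps from the clean input."""
    _, g = loss_and_input_grad(w, b, x, y)
    return eps * np.sign(g)


def fgsm_random_start(w, b, x, y, eps, alpha, rng):
    """FGSM with a random step: sample a uniform start in the l_inf ball,
    take one signed-gradient step of size alpha, then clip back to the ball."""
    eta = rng.uniform(-eps, eps, size=x.shape)
    _, g = loss_and_input_grad(w, b, x + eta, y)
    return np.clip(eta + alpha * np.sign(g), -eps, eps)


# Toy setup (hypothetical data, assumed only for this sketch).
rng = np.random.default_rng(0)
w, b = rng.normal(size=20), 0.0
x, y = rng.normal(size=20), 1.0
eps = 0.1

delta_fgsm = fgsm(w, b, x, y, eps)
delta_rs = fgsm_random_start(w, b, x, y, eps, alpha=1.25 * eps, rng=rng)

# Both perturbations stay inside the l_inf ball of radius eps; the l2 norms
# printed here show how much of the ball each one actually uses.
print("FGSM l2 norm:        ", np.linalg.norm(delta_fgsm))
print("Random-step l2 norm: ", np.linalg.norm(delta_rs))
```

Plain FGSM always lands on a corner of the $\ell_\infty$ ball, so every coordinate has magnitude `eps`. With the random step, any coordinate whose random start points far enough against the gradient sign ends up strictly inside the ball after clipping, so the resulting perturbation is never larger and is typically smaller.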


Code & Models

Repositories

tml-epfl/understanding-fast-adv-training
PyTorch · Official

Videos

Understanding and Improving Fast Adversarial Training · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications