Reducing Adversarial Training Cost with Gradient Approximation
Huihui Gong

TL;DR
This paper introduces GAAT, a novel adversarial training method that approximates gradients to significantly reduce training time while maintaining model robustness and accuracy.
Contribution
The paper proposes a gradient approximation technique for adversarial training, substantially decreasing computational costs without sacrificing model performance.
Findings
GAAT reduces training time by up to 60%.
Models trained with GAAT maintain comparable accuracy on natural and adversarial examples.
The method is effective across MNIST, CIFAR-10, and CIFAR-100 datasets.
Abstract
Deep learning models have achieved state-of-the-art performance in various domains, yet they are vulnerable to inputs with small, well-crafted perturbations, known as adversarial examples (AEs). Among the many strategies for improving model robustness against AEs, Projected Gradient Descent (PGD) based adversarial training is one of the most effective. Unfortunately, the prohibitive computational overhead of generating sufficiently strong AEs, which stems from maximizing the loss function, sometimes makes regular PGD adversarial training impractical for larger and more complicated models. In this paper, we propose that the adversarial loss can be approximated by a partial sum of its Taylor series. Furthermore, we approximate the gradient of the adversarial loss and propose a new and efficient adversarial training method, adversarial training with gradient…
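The abstract is cut off before it states the order of the Taylor expansion or the gradient formula, so the following is only a minimal sketch of the general idea and not the paper's GAAT method. It assumes a first-order expansion, an L-infinity perturbation budget `eps`, and a toy linear logistic model; the function names (`loss`, `input_grad`, `first_order_adv_loss`) are hypothetical. Under those assumptions, the worst-case loss L(x + δ) over ‖δ‖∞ ≤ eps is approximated by L(x) + eps · ‖∇ₓL(x)‖₁, which needs one input gradient instead of an iterative PGD attack.

```python
import numpy as np

def loss(w, b, x, y):
    # Logistic loss of a linear model for a label y in {-1, +1}.
    return np.log1p(np.exp(-y * (w @ x + b)))

def input_grad(w, b, x, y):
    # Gradient of the logistic loss with respect to the input x.
    margin = y * (w @ x + b)
    return -y * w / (1.0 + np.exp(margin))

def first_order_adv_loss(w, b, x, y, eps):
    # First-order Taylor estimate of the worst-case loss (an assumption,
    # not the paper's formula): L(x + d) ~ L(x) + d . g, and the maximum
    # of d . g over ||d||_inf <= eps equals eps * ||g||_1.
    g = input_grad(w, b, x, y)
    return loss(w, b, x, y) + eps * np.abs(g).sum()

if __name__ == "__main__":
    w = np.array([0.5, -1.0, 2.0])
    x = np.array([1.0, 2.0, -1.0])
    print("clean loss:       ", loss(w, 0.1, x, 1.0))
    print("approx. adv. loss:", first_order_adv_loss(w, 0.1, x, 1.0, 0.01))
```

For this convex toy loss the first-order estimate is a lower bound on the true worst-case loss, and the gap shrinks quadratically as `eps` decreases; a deep network offers no such guarantee, which is presumably why the choice of expansion order matters.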
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications
