Understanding Adversarial Training: Increasing Local Stability of Neural Nets through Robust Optimization

Uri Shaham, Yutaro Yamada, and Sahand Negahban

arXiv:1511.05432·stat.ML·May 7, 2018

TL;DR

This paper introduces a robust optimization framework for adversarial training of neural networks, enhancing their local stability and robustness against adversarial examples while also improving test accuracy.

Contribution

It presents a general alternating minimization-maximization framework that unifies and extends previous adversarial training methods for neural networks.
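
As a sketch of what such a minimization-maximization objective can look like, the formula below uses notation assumed here rather than taken from this page: \theta for the network parameters, J for the training loss, (x_i, y_i) for the m training pairs, and \mathcal{U}_i for a set of allowed perturbations around x_i (for example a norm ball of radius c).

```latex
\min_{\theta} \; \sum_{i=1}^{m} \; \max_{\tilde{x}_i \in \mathcal{U}_i} J(\theta, \tilde{x}_i, y_i),
\qquad
\mathcal{U}_i = \{\, \tilde{x} : \|\tilde{x} - x_i\| \le c \,\}
```

Under this reading, an alternating procedure approximately solves the inner maximization for the current \theta, then takes a descent step on \theta using the perturbed examples it found.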

Findings

1. Increases neural network robustness to adversarial examples.

2. Makes it harder to generate new adversarial examples.

3. Improves accuracy on original test data.

Abstract

We propose a general framework for increasing local stability of Artificial Neural Nets (ANNs) using Robust Optimization (RO). We achieve this through an alternating minimization-maximization procedure, in which the loss of the network is minimized over perturbed examples that are generated at each parameter update. We show that adversarial training of ANNs is in fact robustification of the network optimization, and that our proposed framework generalizes previous approaches for increasing local stability of ANNs. Experimental results reveal that our approach increases the robustness of the network to existing adversarial examples, while making it harder to generate new ones. Furthermore, our algorithm also improves the accuracy of the network on the original test data.
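
The alternating procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps the neural network for a linear logistic-regression model so that it runs with the standard library alone, and it assumes an L-infinity perturbation budget `eps`, for which the worst-case perturbation of a linear model is a step of size `eps` along the sign of the loss gradient with respect to the input. All function names (`perturb`, `train`, `robust_accuracy`) and the toy data are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, b, x):
    # Linear score w.x + b (stand-in for a network's output).
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def perturb(w, b, x, y, eps):
    """Maximization step: shift x by eps along the sign of d(loss)/dx.

    Loss is log(1 + exp(-y * score)), with y in {-1, +1}.
    The result stays inside the L-infinity ball of radius eps around x.
    """
    coeff = -y * sigmoid(-y * score(w, b, x))  # d(loss)/d(score)
    grad_x = [coeff * wi for wi in w]          # chain rule through the score
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad_x)]


def train(data, eps, epochs=50, lr=0.1, seed=0):
    """Alternate between perturbing each example and a gradient step on it."""
    rng = random.Random(seed)
    w, b = [0.0] * len(data[0][0]), 0.0
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            # Perturbed example generated at the current parameters.
            x_adv = perturb(w, b, x, y, eps)
            # Minimization step: descend the loss at the perturbed example.
            coeff = -y * sigmoid(-y * score(w, b, x_adv))
            w = [wi - lr * coeff * xi for wi, xi in zip(w, x_adv)]
            b -= lr * coeff
    return w, b


def robust_accuracy(w, b, data, eps):
    # Fraction of examples still classified correctly after perturbation.
    hits = sum(y * score(w, b, perturb(w, b, x, y, eps)) > 0 for x, y in data)
    return hits / len(data)


# Toy, linearly separable data (an assumption for illustration only).
toy = [([2.0, 2.0], 1), ([2.0, 3.0], 1), ([3.0, 2.0], 1), ([3.0, 3.0], 1),
       ([-2.0, -2.0], -1), ([-2.0, -3.0], -1), ([-3.0, -2.0], -1), ([-3.0, -3.0], -1)]
w, b = train(toy, eps=0.5)
acc = robust_accuracy(w, b, toy, eps=0.5)
```

Setting `eps=0` reduces `train` to ordinary stochastic gradient descent, which makes it easy to compare the two regimes on the same data. For a real network the inner maximization has no closed form, so the sign-of-gradient step would only be an approximation of the worst case.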
