Explaining and Harnessing Adversarial Examples

Ian J. Goodfellow; Jonathon Shlens; Christian Szegedy

arXiv:1412.6572 · stat.ML · March 24, 2015 · ICLR · 8.1k cites

TL;DR

This paper argues that neural networks' vulnerability to adversarial examples stems primarily from their linear nature. That view yields a simple, fast method for generating adversarial examples, which the authors use for adversarial training to reduce test error on MNIST.

Contribution

It offers a linearity-based explanation for adversarial vulnerability and a fast method for generating adversarial examples. Training on the examples this method produces makes the model more robust.
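The linearity argument can be illustrated with a small numerical sketch. It assumes a single linear unit with weight vector `w` and a perturbation bounded in max norm by `eps`; the perturbation `eps * sign(w)` then shifts the activation `w @ x` by `eps * sum(|w|)`, which grows with the input dimension even though no single input coordinate moves by more than `eps`. The function name `activation_shift` and the random Gaussian weights are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def activation_shift(w, eps):
    """Change in w @ x caused by adding eta = eps * sign(w) to x.

    Every coordinate of eta has magnitude at most eps, so eta is small
    in max norm, yet w @ eta equals eps * sum(|w|).
    """
    eta = eps * np.sign(w)
    return float(w @ eta)

# Hypothetical weights: random Gaussian vectors of increasing dimension.
rng = np.random.default_rng(0)
eps = 0.01
shifts = {}
for n in (10, 1_000, 100_000):
    w = rng.normal(size=n)
    shifts[n] = activation_shift(w, eps)
    print(f"n={n:>7}  shift in activation = {shifts[n]:.3f}")
```

With the per-coordinate budget `eps` held fixed, the shift scales roughly linearly with the dimension `n`, which is the intuition behind the claim that a simple linear model can have adversarial examples when its input is high-dimensional.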

Findings

01  Linear nature explains adversarial vulnerability

02  New quantitative support for the explanation

03  Adversarial training reduces test error on MNIST

Abstract

Several machine learning models, including neural networks, consistently misclassify adversarial examples: inputs formed by applying small but intentionally worst-case perturbations to examples from the dataset, such that the perturbed input results in the model outputting an incorrect answer with high confidence. Early attempts at explaining this phenomenon focused on nonlinearity and overfitting. We argue instead that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature. This explanation is supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets. Moreover, this view yields a simple and fast method of generating adversarial examples. Using this approach to provide examples for adversarial training, we reduce the test set…
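The abstract does not spell out the "simple and fast method"; in the full paper it is the fast gradient sign method, which perturbs an input by `eps` times the sign of the gradient of the loss with respect to that input. The sketch below applies one such step to binary logistic regression, where that gradient has the closed form `(p - y) * w`. The weights, the input, and the value of `eps` are made-up illustrative values, and `fgsm_logistic` is a hypothetical helper name rather than code from the authors.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_logistic(x, y, w, b, eps):
    """One fast-gradient-sign step against binary logistic regression.

    The loss is cross-entropy with label y in {0, 1}. Its gradient with
    respect to the input x is (p - y) * w, where p = sigmoid(w @ x + b).
    """
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

# Illustrative (assumed) model parameters and input.
w = np.array([2.0, -3.0, 1.0])
b = 0.0
x = np.array([0.2, -0.1, 0.1])
y = 1

x_adv = fgsm_logistic(x, y, w, b, eps=0.25)

print("clean prediction:      ", sigmoid(w @ x + b))
print("adversarial prediction:", sigmoid(w @ x_adv + b))
print("max coordinate change: ", np.max(np.abs(x_adv - x)))
```

In this toy setting each coordinate moves by exactly `eps`, and the predicted probability of the true class drops from above 0.5 to below it. For adversarial training, examples generated this way would be mixed into the training objective; the exact objective the authors use is not shown in this excerpt.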

Code & Models

Models

🤗 deepcode-ai/diffai · model

Videos

Adversarial Machine Learning explained! | With examples. · YouTube

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Anomaly Detection Techniques and Applications