When adversarial examples are excusable
Pieter-Jan Kindermans, Charles Staats

TL;DR
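This paper compares test errors and adversarial errors in neural networks on a controlled visual classification problem. Test errors tend to lie near the ground-truth decision boundary, whereas adversarial examples can be found almost everywhere. Constraining adversarial examples to the data manifold removes about 90% of them, and the errors that remain look more like difficult, excusable test errors.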
Contribution
The study demonstrates that constraining adversarial examples to the data manifold reduces adversarial errors by about 90% and makes the remaining adversarial mistakes more similar to genuine test errors.
Findings
Test errors tend to lie close to the ground-truth decision boundary, whereas unconstrained adversarial examples can be found almost everywhere.
Constraining inputs to the data manifold reduces adversarial errors by about 90%.
Remaining adversarial errors resemble difficult but justifiable test errors.
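The difference between unconstrained and on-manifold adversarial search can be illustrated with a small toy sketch. This is not the paper's experimental setup: the unit-circle manifold, the ground-truth rule, the classifier `predict`, the constant `C`, and the candidate-grid search are all assumptions invented for illustration, and the rates it prints are specific to this toy.

```python
import numpy as np

C = 3.0    # hypothetical: how strongly the toy classifier reacts off the manifold
EPS = 0.1  # hypothetical L2 perturbation budget

def manifold(theta):
    """Map latent angles to points on the unit circle (the toy data manifold)."""
    return np.stack([np.cos(theta), np.sin(theta)], axis=-1)

def true_label(x):
    """Toy ground truth, defined everywhere in the plane: first coordinate > 0."""
    return x[..., 0] > 0

def predict(x):
    """Toy classifier: slightly tilted on the manifold, unreliable off it."""
    r2 = np.sum(x ** 2, axis=-1)
    return x[..., 0] + 0.1 * x[..., 1] + C * (r2 - 1.0) > 0

def has_error(candidates):
    """For candidates of shape (n, k, 2): is any of the k candidates misclassified?"""
    return np.any(predict(candidates) != true_label(candidates), axis=1)

def unconstrained_candidates(x, eps, n_radii=9, n_angles=32):
    """Polar grid of perturbations filling the eps-ball around each input."""
    radii = np.linspace(0.0, eps, n_radii)
    angles = np.linspace(0.0, 2 * np.pi, n_angles, endpoint=False)
    offsets = (radii[:, None, None] * manifold(angles)[None, :, :]).reshape(-1, 2)
    return x[:, None, :] + offsets[None, :, :]

def on_manifold_candidates(theta, eps, n=41):
    """Perturb the latent angle instead of the input.

    A latent shift d moves the point by a chord of length 2*sin(|d|/2) <= |d|,
    so every candidate stays on the circle and inside the eps-ball.
    """
    deltas = np.linspace(-eps, eps, n)
    return manifold(theta[:, None] + deltas[None, :])

thetas = np.linspace(0.0, 2 * np.pi, 1000, endpoint=False)
x = manifold(thetas)

clean_err = predict(x) != true_label(x)
off_err = has_error(unconstrained_candidates(x, EPS))
on_err = has_error(on_manifold_candidates(thetas, EPS))

print("clean test error rate:      ", clean_err.mean())
print("unconstrained adv. error:   ", off_err.mean())
print("on-manifold adv. error:     ", on_err.mean())
# Largest distance to the true boundary (x0 = 0) among on-manifold errors
print("max |x0| of on-manifold err:", np.abs(x[on_err, 0]).max())
```

In this toy, inputs that only have unconstrained adversarial errors can sit far from the true boundary, while inputs with on-manifold adversarial errors all sit close to it. A grid search is used instead of a gradient-based attack only to keep the sketch deterministic and dependency-free.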
Abstract
Neural networks work remarkably well in practice, and in theory they can be universal approximators. However, they still make mistakes, and one specific type, called adversarial errors, seems inexcusable to humans. In this work, we analyze both test errors and adversarial errors on a well-controlled but highly non-linear visual classification problem. We find that, when approximating training on infinite data, test errors tend to be close to the ground truth decision boundary. Qualitatively speaking, these are also more difficult for a human. By contrast, adversarial examples can be found almost everywhere and are often obvious mistakes. However, when we constrain adversarial examples to the manifold, we observe a 90% reduction in adversarial errors. If we inflate the manifold by training with Gaussian noise, we observe a similar effect. In both cases, the remaining adversarial…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)
