On the (Un-)Avoidability of Adversarial Examples
Sadia Chowdhury, Ruth Urner

TL;DR
This paper proposes a new framework for understanding adversarial examples by defining an adaptive robustness measure that aligns with the data distribution, and introduces a data-augmentation method to improve model reliability.
Contribution
It introduces a locally adaptive robustness framework and an empirical adaptive robust loss, along with a data-augmentation approach that preserves 1-nearest neighbor consistency.
Findings
Adaptive data augmentation maintains 1-nearest neighbor consistency.
The proposed robustness measure aligns with the data distribution.
Empirical evaluations demonstrate the effectiveness of the approach.
Abstract
The phenomenon of adversarial examples in deep learning models has caused substantial concern over their reliability. While many deep neural networks have shown impressive performance in terms of predictive accuracy, it has been shown that in many instances an imperceptible perturbation can falsely flip the network's prediction. Most research has then focused on developing defenses against adversarial attacks or learning under a worst-case adversarial loss. In this work, we take a step back and aim to provide a framework for determining whether a model's label change under small perturbation is justified (and when it is not). We carefully argue that adversarial robustness should be defined as a locally adaptive measure complying with the underlying distribution. We then suggest a definition for an adaptive robust loss, derive an empirical version of it, and develop a resulting…
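The abstract above is truncated and this page does not give the formal definitions, so the following is only a rough sketch of what a locally adaptive robustness measure, an empirical adaptive robust loss, and the matching data augmentation might look like. It rests on an assumption that is not confirmed here: each point's robustness radius is taken to be a fixed fraction (`scale`) of its distance to the nearest training point with a different label. All function names and parameters are hypothetical, and the loss is approximated by random probing rather than a worst-case search.

```python
import numpy as np

def adaptive_radii(X, y, scale=0.5):
    """Per-point radius: `scale` times the distance to the nearest training
    point with a different label (assumed rule; needs at least two classes)."""
    # Pairwise Euclidean distances, shape (n, n).
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Ignore same-label pairs (including each point with itself).
    dists[y[:, None] == y[None, :]] = np.inf
    return scale * dists.min(axis=1)

def augment(X, y, radii, n_aug=5, seed=0):
    """Sample n_aug points uniformly inside each point's ball; copy its label."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Random unit directions.
    directions = rng.normal(size=(n, n_aug, d))
    directions /= np.linalg.norm(directions, axis=-1, keepdims=True)
    # u**(1/d) with u in [0, 1) yields a uniform density over the ball.
    u = rng.random(size=(n, n_aug, 1)) ** (1.0 / d)
    X_aug = X[:, None, :] + directions * u * radii[:, None, None]
    return X_aug.reshape(-1, d), np.repeat(y, n_aug)

def predict_1nn(X_train, y_train, X_query):
    """Plain 1-nearest-neighbor prediction."""
    dists = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=-1)
    return y_train[dists.argmin(axis=1)]

def empirical_adaptive_robust_loss(predict, X, y, radii, n_probe=20, seed=1):
    """Fraction of points that are misclassified, or whose prediction is wrong
    at some randomly sampled perturbation inside their own adaptive ball."""
    X_probe, y_probe = augment(X, y, radii, n_aug=n_probe, seed=seed)
    wrong_probe = (predict(X_probe) != y_probe).reshape(len(X), n_probe).any(axis=1)
    wrong_clean = predict(X) != y
    return float(np.mean(wrong_clean | wrong_probe))

# Toy usage: two classes on a line.
X = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0], [5.0, 0.0]])
y = np.array([0, 0, 1, 1])
radii = adaptive_radii(X, y)
X_aug, y_aug = augment(X, y, radii, n_aug=3)
loss = empirical_adaptive_robust_loss(lambda Q: predict_1nn(X, y, Q), X, y, radii)
```

With `scale=0.5`, every sampled point lies strictly closer to its source point than to any differently labeled training point, so a 1-nearest-neighbor classifier trained on the original data incurs zero loss in this sketch. That is one intuition for why such an augmentation can be compatible with 1-nearest-neighbor behavior, though the paper's own argument may differ.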
Taxonomy
Methods: FLIP
