How adversarial attacks can disrupt seemingly stable accurate classifiers

Oliver J. Sutton; Qinghua Zhou; Ivan Y. Tyukin; Alexander N. Gorban; Alexander Bastounis; Desmond J. Higham

arXiv:2309.03665 · cs.LG · September 13, 2024

Open Access

TL;DR

This paper reveals that high-dimensional classifiers are inherently susceptible to small adversarial perturbations despite robustness to random noise, highlighting fundamental vulnerabilities in neural networks.

Contribution

The authors introduce a general framework explaining why classifiers are vulnerable to adversarial attacks while remaining robust to random noise, supported by empirical evidence.

Findings

1. High-dimensional classifiers are prone to adversarial attacks despite robustness to random noise.

2. Random perturbations are ineffective for detecting adversarial examples.

3. Adversarial training is necessary to mitigate vulnerabilities.

Abstract

Adversarial attacks dramatically change the output of an otherwise accurate learning system using a seemingly inconsequential modification to a piece of input data. Paradoxically, empirical evidence indicates that even systems which are robust to large random perturbations of the input data remain susceptible to small, easily constructed, adversarial perturbations of their inputs. Here, we show that this may be seen as a fundamental feature of classifiers working with high dimensional input data. We introduce a simple generic and generalisable framework for which key behaviours observed in practical systems arise with high probability -- notably the simultaneous susceptibility of the (otherwise accurate) model to easily constructed adversarial attacks, and robustness to random perturbations of the input data. We confirm that the same phenomena are directly observed in practical neural…
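
The contrast the abstract describes, robustness to large random perturbations alongside susceptibility to small targeted ones, can be illustrated with a toy example. The sketch below is not the authors' framework: it assumes a hypothetical linear classifier sign(w·x) in n = 1000 dimensions and a single input placed at a small margin from the decision hyperplane, and all names (`run_demo`, `margin`, `noise_norm`) are illustrative. A random direction in high dimensions is nearly orthogonal to w, so random noise of norm 1.0 almost never crosses the boundary, while a perturbation of norm 0.21 aimed along -w always does.

```python
import math
import random


def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def random_unit_vector(n, rng):
    """Sample a direction uniformly on the unit sphere in n dimensions."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def classify(w, x):
    """Toy linear classifier (an assumption of this sketch): sign of w.x."""
    return 1 if dot(w, x) >= 0.0 else -1


def run_demo(n=1000, margin=0.2, noise_norm=1.0, trials=200, seed=0):
    rng = random.Random(seed)
    w = random_unit_vector(n, rng)       # unit normal of the decision hyperplane
    x = [margin * a for a in w]          # input at distance `margin` from the boundary
    base = classify(w, x)

    # Random perturbations of norm `noise_norm`: count how often the label flips.
    flips = 0
    for _ in range(trials):
        d = random_unit_vector(n, rng)
        x_noisy = [a + noise_norm * b for a, b in zip(x, d)]
        if classify(w, x_noisy) != base:
            flips += 1

    # Targeted perturbation: step just past the margin, directly towards the boundary.
    adv_norm = 1.05 * margin
    x_adv = [a - adv_norm * b for a, b in zip(x, w)]

    return {
        "random_flip_rate": flips / trials,
        "adversarial_flipped": classify(w, x_adv) != base,
        "adv_norm": adv_norm,
        "noise_norm": noise_norm,
    }


if __name__ == "__main__":
    print(run_demo())
```

In this toy setting the component of a random unit vector along w has standard deviation about 1/sqrt(n), roughly 0.03 for n = 1000, so a margin of 0.2 lies several standard deviations away from what random noise of norm 1.0 typically reaches. This is only a geometric intuition under the stated assumptions; the paper's own framework and experiments on neural networks are more general.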

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Integrated Circuits and Semiconductor Failure Analysis