Analysis of classifiers' robustness to adversarial perturbations
Alhussein Fawzi, Omar Fawzi, Pascal Frossard

TL;DR
This paper provides a theoretical analysis of the robustness limits of classifiers to adversarial perturbations, explaining the instability observed in deep networks and highlighting the fundamental role of task difficulty and classifier flexibility.
Contribution
It introduces the first theoretical framework for understanding adversarial vulnerability, establishing upper bounds on classifier robustness based on task distinguishability.
Findings
Robustness upper bounds depend on task difficulty and classifier flexibility.
Adversarial robustness is fundamentally limited in low distinguishability tasks.
For linear classifiers, robustness to random noise exceeds adversarial robustness by a factor proportional to the square root of the signal dimension.
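The gap between random-noise and adversarial robustness for a linear classifier can be checked numerically. The sketch below is a simplified proxy, not the paper's exact definitions: it assumes a linear classifier f(x) = w·x + b, takes the adversarial robustness as the L2 distance to the decision boundary, and takes the random-noise robustness as the median distance to the boundary along uniformly random directions. The function names (`adversarial_norm`, `random_flip_norm`, `gap_ratio`) and the Gaussian choice of `w` and `x` are illustrative assumptions.

```python
import numpy as np


def adversarial_norm(w, b, x):
    """Smallest L2 perturbation that moves x onto the boundary of f(x) = w.x + b."""
    return abs(w @ x + b) / np.linalg.norm(w)


def random_flip_norm(w, b, x, rng, trials=2000):
    """Median distance to the boundary along uniformly random unit directions.

    This is a simplified stand-in for robustness to random noise.
    """
    d = x.shape[0]
    f = w @ x + b
    dists = []
    for _ in range(trials):
        v = rng.standard_normal(d)
        v /= np.linalg.norm(v)  # uniform direction on the unit sphere
        slope = w @ v           # f(x + t*v) = f + t*slope, zero at t = -f/slope
        dists.append(abs(f / slope))
    return float(np.median(dists))


def gap_ratio(d, rng):
    """Ratio of random-direction distance to adversarial distance in dimension d."""
    w = rng.standard_normal(d)  # hypothetical classifier weights
    x = rng.standard_normal(d)  # hypothetical data point
    b = 0.0
    return random_flip_norm(w, b, x, rng) / adversarial_norm(w, b, x)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for d in (100, 400, 1600):
        r = gap_ratio(d, rng)
        print(f"d={d:5d}  ratio={r:8.2f}  ratio/sqrt(d)={r / np.sqrt(d):.2f}")
```

Under these assumptions the ratio reduces to the median of 1/|cos θ|, where θ is the angle between `w` and a random direction, so it should grow roughly like the square root of `d` while `ratio/sqrt(d)` stays near a constant.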
Abstract
The goal of this paper is to analyze an intriguing phenomenon recently discovered in deep networks, namely their instability to adversarial perturbations (Szegedy et al., 2014). We provide a theoretical framework for analyzing the robustness of classifiers to adversarial perturbations, and show fundamental upper bounds on the robustness of classifiers. Specifically, we establish a general upper bound on the robustness of classifiers to adversarial perturbations, and then illustrate the obtained upper bound on the families of linear and quadratic classifiers. In both cases, our upper bound depends on a distinguishability measure that captures the notion of difficulty of the classification task. Our results for both classes imply that in tasks involving small distinguishability, no classifier in the considered set will be robust to adversarial perturbations, even if a good accuracy is…
