Uniform Convergence of Adversarially Robust Classifiers
Rachel Morris, Ryan Murray

TL;DR
This paper establishes that in a large data limit, adversarially-robust classifiers converge to the optimal Bayes classifier as adversarial strength diminishes, using geometric measure theory techniques.
Contribution
It provides a general framework showing that adversarially robust classifiers converge to the Bayes classifier in the Hausdorff distance, strengthening previous $L^1$-type results.
Findings
Optimal classifiers converge to the Bayes classifier as adversarial strength approaches zero.
Convergence is in Hausdorff distance, strengthening previous $L^1$-type results.
Geometric measure theory techniques are used for the proof.
Abstract
In recent years there has been significant interest in the effect of different types of adversarial perturbations in data classification problems. Many of these models incorporate the adversarial power, which is an important parameter with an associated trade-off between accuracy and robustness. This work considers a general framework for adversarially-perturbed classification problems, in a large data or population-level limit. In such a regime, we demonstrate that as adversarial strength goes to zero, optimal classifiers converge to the Bayes classifier in the Hausdorff distance. This significantly strengthens previous results, which generally focus on $L^1$-type convergence. The main argument relies upon direct geometric comparisons and is inspired by techniques from geometric measure theory.
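To see why Hausdorff convergence is a stronger statement than $L^1$-type convergence, the following sketch writes out the standard Hausdorff distance and compares it with the measure of the symmetric difference on a simple one-dimensional example. The sets $A$ and $A_n$ are illustrative assumptions and do not come from the paper, whose precise definitions (for instance, a distance weighted by the data distribution) may differ.

```latex
% Hausdorff distance between nonempty sets A, B in R^d (standard definition)
d_H(A, B) = \max\Big\{ \sup_{x \in A} \inf_{y \in B} |x - y|,\;
                       \sup_{y \in B} \inf_{x \in A} |x - y| \Big\}

% Illustrative sets on the real line (hypothetical, not from the paper)
A = [0, 1], \qquad A_n = [0, 1] \cup [2,\, 2 + 1/n]

% L^1-type distance: Lebesgue measure of the symmetric difference
\| \mathbf{1}_{A_n} - \mathbf{1}_{A} \|_{L^1} = |A_n \,\triangle\, A| = 1/n \to 0

% Hausdorff distance: the farthest point of A_n from A is 2 + 1/n
d_H(A_n, A) = 1 + 1/n \to 1 \neq 0
```

In this example the sets converge in the $L^1$ sense while a small, distant piece keeps the Hausdorff distance bounded away from zero. Convergence in the Hausdorff distance rules out such stray pieces, which is the sense in which it controls the decision regions uniformly.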
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
