A Theoretical Framework for Robustness of (Deep) Classifiers against Adversarial Examples
Beilun Wang, Ji Gao, Yanjun Qi

TL;DR
This paper develops a topological theoretical framework to understand why classifiers, especially deep neural networks, are vulnerable to adversarial examples and identifies key conditions for achieving robustness.
Contribution
It introduces a topological analysis of classifier robustness, providing necessary and sufficient conditions for strong robustness against adversarial examples.
Findings
Unnecessary features can compromise robustness.
Proper feature representation is crucial for robustness.
Topological conditions determine classifier vulnerability.
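The first finding can be illustrated with a toy sketch. This is not the paper's construction: the two-feature input, the threshold rules, and the names `oracle`, `classifier`, and `oracle_distance` are all hypothetical, chosen only to show how a feature the oracle ignores can let a perturbation flip the classifier's output while the oracle sees no change at all.

```python
import numpy as np

def oracle(x):
    # Hypothetical oracle: only the first feature determines the label.
    return int(x[0] > 0.0)

def classifier(x):
    # Hypothetical classifier: also relies on a second feature
    # that the oracle never uses (an "unnecessary" feature).
    return int(x[0] + x[1] > 0.0)

def oracle_distance(a, b):
    # Pseudometric in the oracle's feature space: differences in the
    # second feature are invisible, so distinct points can be at distance 0.
    return abs(a[0] - b[0])

x = np.array([0.5, 0.0])
x_adv = np.array([0.5, -1.0])  # perturb only the unnecessary feature

same_to_oracle = oracle_distance(x, x_adv) == 0.0      # oracle cannot tell them apart
oracle_agrees = oracle(x) == oracle(x_adv)              # oracle label is unchanged
classifier_flips = classifier(x) != classifier(x_adv)   # classifier label changes

print(same_to_oracle, oracle_agrees, classifier_flips)
```

In this toy setting the pair `x`, `x_adv` behaves like an adversarial example: the oracle treats the two points as identical, yet the classifier assigns them different labels because it depends on a feature the oracle does not.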
Abstract
Most machine learning classifiers, including deep neural networks, are vulnerable to adversarial examples. Such inputs are typically generated by adding small but purposeful modifications that lead to incorrect outputs while remaining imperceptible to human eyes. The goal of this paper is not to introduce a single method, but to make theoretical steps towards fully understanding adversarial examples. By using concepts from topology, our theoretical analysis brings forth the key reasons why an adversarial example can fool a classifier (f1) and adds its oracle (f2, like human eyes) in such analysis. By investigating the topological relationship between two (pseudo)metric spaces corresponding to predictor f1 and oracle f2, we develop necessary and sufficient conditions that can determine if f1 is always robust (strong-robust) against adversarial examples according to f2.…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Malware Detection Techniques
