Adversarial Examples and Metrics
Nico Döttling, Kathrin Grosse, Michael Backes, Ian Molloy

TL;DR
This paper studies the limits of robust classification when the target metric is uncertain. It shows that a small classifier can be robust if the metric is known at training time but fails if the metric is revealed only afterward, and it links this hardness to cryptography.
Contribution
It introduces a new perspective on adversarial robustness by considering uncertain metrics and connects robust classification hardness to cryptographic models.
Findings
Robust classification is feasible with known metrics for small classifiers.
Robust classification becomes impossible for small classifiers when the metric is uncertain.
The hardness of robust classification is connected to cryptography in a novel way.
Abstract
Adversarial examples are a type of attack on machine learning (ML) systems which cause misclassification of inputs. Achieving robustness against adversarial examples is crucial to apply ML in the real world. While most prior work on adversarial examples is empirical, a recent line of work establishes fundamental limitations of robust classification based on cryptographic hardness. Most positive and negative results in this field however assume that there is a fixed target metric which constrains the adversary, and we argue that this is often an unrealistic assumption. In this work we study the limitations of robust classification if the target metric is uncertain. Concretely, we construct a classification problem, which admits robust classification by a small classifier if the target metric is known at the time the model is trained, but for which robust classification is impossible for…
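To make the role of the target metric concrete, the following minimal Python sketch checks whether a perturbation stays within an adversary's budget under two different metrics. It is an illustration only and is not the construction from the paper: the helper names (`lp_norm`, `within_budget`), the toy 100-dimensional input, and the budgets are all hypothetical choices made for this example. It shows that a perturbation admissible under one metric can be inadmissible under another, which is why fixing the metric in advance matters.

```python
import math

def lp_norm(delta, p):
    """Return the l_p norm of a perturbation vector (p may be math.inf)."""
    if p == math.inf:
        return max(abs(d) for d in delta)
    return sum(abs(d) ** p for d in delta) ** (1.0 / p)

def within_budget(x, x_adv, p, eps):
    """True if x_adv lies within distance eps of x under the l_p metric."""
    delta = [a - b for a, b in zip(x_adv, x)]
    return lp_norm(delta, p) <= eps

# Hypothetical toy input: the origin in 100 dimensions.
x = [0.0] * 100

# Two candidate adversarial perturbations (assumed for illustration):
dense = [0.1] * 100            # small change to every coordinate
sparse = [1.0] + [0.0] * 99    # large change to a single coordinate

# Assumed budgets, one per metric.
EPS_LINF = 0.5
EPS_L1 = 2.0

# Under l_inf the dense perturbation is allowed and the sparse one is not.
dense_ok_linf = within_budget(x, dense, math.inf, EPS_LINF)
sparse_ok_linf = within_budget(x, sparse, math.inf, EPS_LINF)

# Under l_1 the situation is reversed.
dense_ok_l1 = within_budget(x, dense, 1, EPS_L1)
sparse_ok_l1 = within_budget(x, sparse, 1, EPS_L1)

print(dense_ok_linf, sparse_ok_linf, dense_ok_l1, sparse_ok_l1)
```

A classifier hardened only against the first budget gives no guarantee against the sparse perturbation, and vice versa, so the set of inputs a model must classify consistently depends on which metric constrains the adversary.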
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Physical Unclonable Functions (PUFs) and Hardware Security · Advanced Malware Detection Techniques
