Understanding Intrinsic Robustness Using Label Uncertainty
Xiao Zhang, David Evans

TL;DR
This paper investigates the intrinsic robustness of classification problems by incorporating label uncertainty into concentration-of-measure estimates. It argues that standard concentration fails to fully characterize intrinsic robustness because it ignores data labels.
Contribution
The paper introduces a definition of label uncertainty and adapts a concentration estimation algorithm to account for it, giving more accurate intrinsic robustness measures for benchmark image classification problems.
Findings
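Error regions induced by state-of-the-art models tend to have much higher label uncertainty than randomly selected subsets.
Concentration estimates that account for label uncertainty yield more accurate intrinsic robustness measures than standard concentration estimates.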
Abstract
A fundamental question in adversarial machine learning is whether a robust classifier exists for a given task. A line of research has made some progress towards this goal by studying the concentration of measure, but we argue standard concentration fails to fully characterize the intrinsic robustness of a classification problem, since it ignores data labels, which are essential to any classification task. Building on a novel definition of label uncertainty, we empirically demonstrate that error regions induced by state-of-the-art models tend to have much higher label uncertainty than randomly selected subsets. This observation motivates us to adapt a concentration estimation algorithm to account for label uncertainty, resulting in more accurate intrinsic robustness measures for benchmark image classification problems.
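The comparison described in the abstract, between the label uncertainty of an error region and that of randomly selected subsets of the same size, can be sketched as follows. This is only an illustration under stated assumptions: the text above does not give the paper's exact definition of label uncertainty, so the score used here (one minus the soft-label mass on the assigned class, plus the largest mass on any other class) is a hypothetical stand-in, and the function names `label_uncertainty` and `compare_to_random_subsets` are invented for this example.

```python
import random
from statistics import mean


def label_uncertainty(soft_label, assigned):
    """Hypothetical per-example score in [0, 2] (an assumption, not confirmed by the text).

    soft_label: list of class probabilities, e.g. from human annotators.
    assigned:   index of the class the dataset assigns to this example.
    Requires at least two classes.
    """
    p_assigned = soft_label[assigned]
    p_other = max(p for c, p in enumerate(soft_label) if c != assigned)
    return 1.0 - p_assigned + p_other


def compare_to_random_subsets(scores, error_indices, n_trials=1000, seed=0):
    """Return (mean score over the error region, mean score over random subsets).

    Each random subset has the same size as the error region and is drawn
    without replacement from all examples.
    """
    rng = random.Random(seed)
    k = len(error_indices)
    error_mean = mean(scores[i] for i in error_indices)
    random_mean = mean(mean(rng.sample(scores, k)) for _ in range(n_trials))
    return error_mean, random_mean


# Toy data: three examples, all assigned class 0, with made-up soft labels.
toy_soft_labels = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.2, 0.8, 0.0]]
toy_scores = [label_uncertainty(s, 0) for s in toy_soft_labels]

# Suppose a model misclassifies only the last (most ambiguous) example.
error_mean, random_mean = compare_to_random_subsets(toy_scores, [2])
```

In this toy setting the error region consists of the single most ambiguous example, so its mean score exceeds the random-subset average. Whether the same holds for a real model and dataset is the empirical question the abstract addresses.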
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Machine Learning and Data Classification · Anomaly Detection Techniques and Applications
