Hard-label Manifolds: Unexpected Advantages of Query Efficiency for Finding On-manifold Adversarial Examples
Washington Garcia, Pin-Yu Chen, Somesh Jha, Scott Clouse, Kevin R. B. Butler

TL;DR
This paper explores how query efficiency in hard-label adversarial attacks relates to traversing the data manifold, revealing that such attacks can produce more realistic adversarial examples and informing future robust model design.
Contribution
It introduces an information-theoretic framework linking query efficiency to manifold traversal and demonstrates this behavior through experiments on real datasets and attacks.
Findings
Query efficiency correlates with closer proximity to the data manifold.
Zeroth-order attacks can produce samples with reduced manifold distance.
Manifold-gradient mutual information can guide robust model development.
Abstract
Designing deep networks robust to adversarial examples remains an open problem. Likewise, recent zeroth-order hard-label attacks on image classification models have shown comparable performance to their first-order, gradient-level alternatives. It was recently shown in the gradient-level setting that regular adversarial examples leave the data manifold, while their on-manifold counterparts are in fact generalization errors. In this paper, we argue that query efficiency in the zeroth-order setting is connected to an adversary's traversal through the data manifold. To explain this behavior, we propose an information-theoretic argument based on a noisy manifold distance oracle, which leaks manifold information through the adversary's gradient estimate. Through numerical experiments of manifold-gradient mutual information, we show this behavior acts as a function of the effective problem…
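To make the phrase "the adversary's gradient estimate" concrete, here is a minimal sketch of how a zeroth-order hard-label adversary might estimate a gradient direction from label-only queries. It is not the paper's method: it is a generic sign-based Monte Carlo estimator of the kind used by boundary-style hard-label attacks, applied to a toy linear classifier. The names `hard_label`, `estimate_direction`, `n_queries`, and `delta` are hypothetical and chosen only for illustration.

```python
import numpy as np


def hard_label(x, w, b):
    """Toy hard-label oracle (an assumption): returns only the predicted class, 0 or 1."""
    return int(np.dot(w, x) + b > 0)


def estimate_direction(x, w, b, target_label, n_queries=2000, delta=1e-2, rng=None):
    """Estimate a unit direction toward `target_label` using label-only queries.

    Each query perturbs x along a random unit vector and records +1 if the
    oracle returns `target_label`, otherwise -1. Averaging the signed
    directions gives a Monte Carlo estimate of the gradient direction.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    # Random directions, normalized to unit length.
    u = rng.standard_normal((n_queries, x.size))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    # One hard-label query per direction.
    phi = np.array([
        1.0 if hard_label(x + delta * ui, w, b) == target_label else -1.0
        for ui in u
    ])
    g = (phi[:, None] * u).mean(axis=0)
    return g / np.linalg.norm(g)


# Toy setup: a linear decision boundary in 20 dimensions.
rng = np.random.default_rng(0)
d = 20
w = rng.standard_normal(d)
b = 0.0

# Project a random point onto the decision boundary w.x + b = 0.
x0 = rng.standard_normal(d)
x_boundary = x0 - (np.dot(w, x0) + b) / np.dot(w, w) * w

g = estimate_direction(x_boundary, w, b, target_label=1, rng=rng)

# For this toy model the true gradient direction is w / ||w||.
cosine = float(np.dot(g, w / np.linalg.norm(w)))
print(f"cosine similarity with true normal: {cosine:.3f}")
```

In this toy case the estimate can be compared against the known normal vector; for a real image classifier no such ground truth is available, and the quality of the estimate depends on the number of queries and on the dimension of the search space.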
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning
