A practical approach to evaluating the adversarial distance for machine learning classifiers
Georg Siedel, Ekagra Gupta, Andrey Morozov

TL;DR
This paper proposes a comprehensive method combining iterative attacks and certification to evaluate the adversarial distance of ML classifiers, providing more informative robustness metrics than traditional accuracy measures.
Contribution
It introduces a novel approach to estimate upper and lower bounds of adversarial distance, improving robustness evaluation for complex models and high-dimensional data.
Findings
The attack method outperforms related implementations.
The certification approach was less effective than expected.
The proposed evaluation provides deeper insights into model robustness.
Abstract
Robustness is critical for machine learning (ML) classifiers to ensure consistent performance in real-world applications where models may encounter corrupted or adversarial inputs. In particular, assessing the robustness of classifiers to adversarial inputs is essential to protect systems from vulnerabilities and thus ensure safety in use. However, methods to accurately compute adversarial robustness have been challenging for complex ML models and high-dimensional data. Furthermore, evaluations typically measure adversarial accuracy on specific attack budgets, limiting the informative value of the resulting metrics. This paper investigates the estimation of the more informative adversarial distance using iterative adversarial attacks and a certification approach. Combined, the methods provide a comprehensive evaluation of adversarial robustness by computing estimates for the upper and…
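To illustrate how an attack and a certificate bracket the adversarial distance from both sides, here is a minimal toy sketch. It is not the paper's implementation. It assumes a linear binary classifier (chosen because the exact L2 distance is known in closed form), a simple fixed-step iterative attack, and a deliberately loose Lipschitz-style certificate. All function names, weights, and step sizes are hypothetical.

```python
import math

def margin(w, b, x):
    """Signed score of a linear binary classifier; its sign is the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def attack_upper_bound(w, b, x, step=0.01, max_steps=10_000):
    """Iterative attack: take small steps against the margin until the label flips.

    Returns the L2 norm of the first perturbation that changes the prediction,
    which is an upper bound on the adversarial distance (None if none is found).
    """
    sign = 1.0 if margin(w, b, x) >= 0 else -1.0
    w_norm = math.sqrt(sum(wi * wi for wi in w))
    direction = [-sign * wi / w_norm for wi in w]  # unit-length descent direction
    x_adv = list(x)
    for _ in range(max_steps):
        x_adv = [xi + step * di for xi, di in zip(x_adv, direction)]
        if margin(w, b, x_adv) * sign < 0:  # prediction flipped
            return math.sqrt(sum((a - o) ** 2 for a, o in zip(x_adv, x)))
    return None

def certified_lower_bound(w, b, x):
    """Lipschitz-style certificate: no perturbation smaller than |f(x)| / L flips the label.

    L = ||w||_1 is a deliberately loose upper bound on the true L2 Lipschitz
    constant ||w||_2, so the certificate is valid but conservative.
    """
    lipschitz_bound = sum(abs(wi) for wi in w)
    return abs(margin(w, b, x)) / lipschitz_bound

def exact_distance(w, b, x):
    """Closed-form L2 distance to the decision boundary (linear models only)."""
    return abs(margin(w, b, x)) / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical toy model and input.
w, b, x = [3.0, 4.0], -0.88, [1.0, 1.0]
lower = certified_lower_bound(w, b, x)
upper = attack_upper_bound(w, b, x)
exact = exact_distance(w, b, x)
print(f"certified lower bound {lower:.3f} <= exact {exact:.3f} <= attack upper bound {upper:.3f}")
```

In this toy setting the true distance lies between the two estimates. A smaller attack step tightens the upper bound, and a tighter Lipschitz estimate tightens the lower bound. For complex models no closed form exists, so the gap between the two bounds is the only available measure of how well the distance is pinned down.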
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
