A Framework for Verification of Wasserstein Adversarial Robustness
Tobias Wegel, Felix Assion, David Mickisch, Florens Greßner

TL;DR
This paper introduces a framework for verifying Wasserstein adversarial robustness in image classifiers, enabling certification and attack methods that better align with human perception, along with a new efficient attack algorithm.
Contribution
It extends existing certification techniques to Wasserstein threat models and proposes a novel, computationally efficient Wasserstein adversarial attack method.
Findings
Framework allows transfer of certification methods to Wasserstein models
Proposed attack reduces computational burden significantly
Certification can be complete or incomplete depending on the model choice
Abstract
Machine learning image classifiers are susceptible to adversarial and corruption perturbations. Adding imperceptible noise to images can lead to severe misclassifications of the machine learning model. Using $L_p$-norms for measuring the size of the noise fails to capture human similarity perception, which is why optimal transport based distance measures like the Wasserstein metric are increasingly being used in the field of adversarial robustness. Verifying the robustness of classifiers using the Wasserstein metric can be achieved by proving the absence of adversarial examples (certification) or proving their presence (attack). In this work we present a framework based on the work by Levine and Feizi, which allows us to transfer existing certification methods for convex polytopes or $L_1$-balls to the Wasserstein threat model. The resulting certification can be complete or incomplete,…
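The abstract's point that norm-based distances miss perceptual similarity can be illustrated with a toy example. The sketch below is not the paper's framework: it uses 1-D signals instead of images, assumes a unit-spaced grid, and the helper name `wasserstein_1d` is hypothetical. It relies on the standard fact that the 1-D Wasserstein-1 distance between two normalized histograms equals the summed absolute difference of their cumulative sums.

```python
import numpy as np

def wasserstein_1d(p, q):
    """Wasserstein-1 distance between two 1-D histograms on a unit-spaced grid.

    Hypothetical helper for illustration only; both inputs are normalized
    to sum to 1 before comparison.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    # In 1-D, W1 is the L1 distance between the cumulative distributions.
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())

n = 32
x = np.zeros(n)
x[10] = 1.0            # a single bright "pixel"
shifted = np.zeros(n)
shifted[11] = 1.0      # same pixel moved by one position
far = np.zeros(n)
far[25] = 1.0          # same pixel moved by fifteen positions

# The Euclidean norm cannot tell the small shift from the large one.
l2_shifted = np.linalg.norm(x - shifted)
l2_far = np.linalg.norm(x - far)

# The Wasserstein distance grows with how far the mass was moved.
w_shifted = wasserstein_1d(x, shifted)
w_far = wasserstein_1d(x, far)

print(l2_shifted, l2_far)
print(w_shifted, w_far)
```

Both Euclidean distances are identical, while the Wasserstein distance separates the one-position shift from the fifteen-position shift, which is the kind of behaviour that motivates the Wasserstein threat model.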
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
