A Unified Wasserstein Distributional Robustness Framework for Adversarial Training
Tuan Anh Bui, Trung Le, Quan Tran, He Zhao, Dinh Phung

TL;DR
This paper introduces a unified Wasserstein distributional robustness framework that generalizes existing adversarial training methods, yielding deep neural networks that are more robust to adversarial attacks.
Contribution
It presents a novel framework connecting Wasserstein distributional robustness with current adversarial training methods, enabling new algorithms and improved robustness.
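To make the connection concrete, the two objectives can be written side by side. This is a sketch in standard textbook notation, not the paper's own: the classifier $f_\theta$, loss $\ell$, data distribution $P$, budget $\epsilon$, and transport cost $c$ are all assumed symbols, and the new cost function the paper introduces is not reproduced here.

```latex
% Pointwise adversarial training (PGD-AT style): each sample is perturbed independently.
\min_{\theta}\; \mathbb{E}_{(x,y)\sim P}\Big[\max_{\|x'-x\|\le\epsilon} \ell\big(f_\theta(x'),\,y\big)\Big]

% Wasserstein distributional robustness: the adversary picks a whole distribution Q
% within a Wasserstein ball of radius \epsilon around P.
\min_{\theta}\; \sup_{Q\,:\,W_c(Q,P)\le\epsilon}\; \mathbb{E}_{(x',y)\sim Q}\Big[\ell\big(f_\theta(x'),\,y\big)\Big]

% Wasserstein distance induced by a cost c, over couplings \gamma of Q and P.
W_c(Q,P) \;=\; \inf_{\gamma\in\Gamma(Q,P)}\; \mathbb{E}_{(x',x)\sim\gamma}\big[c(x',x)\big]
```

In the first objective the budget constrains every sample separately; in the second it constrains the average transport cost, so the adversary may spend more of it on some samples than on others.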
Findings
The proposed distributionally robust AT algorithms are more robust than their standard AT counterparts.
The framework unifies and generalizes existing adversarial training methods.
Extensive experiments demonstrate enhanced robustness of the proposed algorithms.
Abstract
It is well known that deep neural networks (DNNs) are susceptible to adversarial attacks, exposing a severe fragility of deep learning systems. As a result, adversarial training (AT), which incorporates adversarial examples during training, represents a natural and effective approach to strengthening the robustness of a DNN-based classifier. However, most AT-based methods, notably PGD-AT and TRADES, typically seek a pointwise adversary that generates the worst-case adversarial example by independently perturbing each data sample, as a way to "probe" the vulnerability of the classifier. Arguably, there are unexplored benefits in considering such adversarial effects from an entire distribution. To this end, this paper presents a unified framework that connects Wasserstein distributional robustness with current state-of-the-art AT methods. We introduce a new Wasserstein cost function…
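The pointwise adversary that the abstract attributes to PGD-AT can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a binary logistic-regression classifier with labels in {-1, +1} and an L-infinity budget, and the names `loss_and_grad_x` and `pgd_linf` as well as the default values of `eps`, `step`, and `n_steps` are hypothetical choices made for illustration.

```python
import numpy as np


def loss_and_grad_x(w, b, x, y):
    """Per-sample logistic loss and its gradient with respect to the inputs.

    Assumes a linear model (an illustrative stand-in for a DNN) and y in {-1, +1}.
    """
    margin = y * (x @ w + b)
    loss = np.logaddexp(0.0, -margin)          # log(1 + exp(-margin))
    sig = 0.5 * (1.0 - np.tanh(margin / 2.0))  # sigmoid(-margin), numerically stable
    grad = -(sig * y)[:, None] * w[None, :]    # d loss / d x for each sample
    return loss, grad


def pgd_linf(w, b, x, y, eps=0.1, step=0.02, n_steps=10):
    """Pointwise PGD: every sample is perturbed independently within its own eps-ball."""
    x_adv = x.copy()
    for _ in range(n_steps):
        _, grad = loss_and_grad_x(w, b, x_adv, y)
        x_adv = x_adv + step * np.sign(grad)       # signed-gradient ascent on the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)   # project back onto the L-inf ball
    return x_adv


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w, b = rng.normal(size=5), 0.1
    x = rng.normal(size=(8, 5))
    y = np.where(rng.random(8) > 0.5, 1.0, -1.0)

    x_adv = pgd_linf(w, b, x, y)
    clean_loss, _ = loss_and_grad_x(w, b, x, y)
    adv_loss, _ = loss_and_grad_x(w, b, x_adv, y)
    print("mean clean loss:", clean_loss.mean())
    print("mean adversarial loss:", adv_loss.mean())
```

Because the projection step is applied per sample, no sample can exceed its own budget. A distributional adversary of the kind the paper studies relaxes exactly this constraint by bounding the perturbation at the level of the whole distribution instead.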
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
