Wasserstein distributional robustness of neural networks
Xingjian Bai, Guangyi He, Yifan Jiang, Jan Obloj

TL;DR
This paper introduces a Wasserstein distributionally robust optimization framework for neural networks, proposing new attack algorithms and providing bounds on adversarial accuracy under distributional threats, with empirical validation on CIFAR-10.
Contribution
It develops a novel Wasserstein DRO approach for neural network robustness, including new attack algorithms and asymptotic accuracy bounds, extending beyond traditional pointwise attacks.
Findings
Proposed first-order attack algorithms that include FGSM and PGD as special cases.
Provided a fast, first-order accurate asymptotic estimate of adversarial accuracy.
Validated the theoretical results with experiments on the CIFAR-10 dataset.
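For orientation, the two classical pointwise attacks named above can be sketched on a toy problem. This is not the paper's Wasserstein attack. It is a minimal NumPy illustration of standard FGSM and PGD under an l-infinity budget, applied to a hypothetical two-dimensional logistic classifier. All names (`logistic_loss_grad`, `fgsm`, `pgd`, `accuracy`) and all parameter values are assumptions made for the example.

```python
import numpy as np


def logistic_loss_grad(w, b, x, y):
    """Gradient of log(1 + exp(-y * (x @ w + b))) with respect to the inputs x.

    x has shape (n, d), labels y are in {-1, +1} with shape (n,).
    """
    margin = y * (x @ w + b)
    # dL/dmargin = -1 / (1 + exp(margin)); dmargin/dx = y * w
    return (-y / (1.0 + np.exp(margin)))[:, None] * w[None, :]


def fgsm(w, b, x, y, eps):
    """Single-step attack: move every input by eps along the sign of the loss gradient."""
    return x + eps * np.sign(logistic_loss_grad(w, b, x, y))


def pgd(w, b, x, y, eps, step, n_steps):
    """Iterated signed-gradient steps, projected back onto the l-infinity ball of radius eps."""
    x_adv = x.copy()
    for _ in range(n_steps):
        x_adv = x_adv + step * np.sign(logistic_loss_grad(w, b, x_adv, y))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # projection step
    return x_adv


def accuracy(w, b, x, y):
    """Fraction of inputs whose predicted sign matches the label."""
    return float(np.mean(np.sign(x @ w + b) == y))


# Toy data (hypothetical): labels are produced by the classifier itself,
# so every clean point is classified correctly by construction.
rng = np.random.default_rng(0)
w = np.array([1.0, -2.0])
b = 0.0
x = rng.normal(size=(200, 2))
y = np.where(x @ w + b >= 0, 1.0, -1.0)

eps = 0.1
x_fgsm = fgsm(w, b, x, y, eps)
x_pgd = pgd(w, b, x, y, eps, step=0.025, n_steps=10)

clean_acc = accuracy(w, b, x, y)
fgsm_acc = accuracy(w, b, x_fgsm, y)
pgd_acc = accuracy(w, b, x_pgd, y)
print(clean_acc, fgsm_acc, pgd_acc)
```

Both attacks perturb every input by at most `eps` per coordinate, which is the uniform budget that the distributional threat models relax. For a linear model the sign of the gradient does not depend on the input, so PGD ends at the same point as FGSM here. On a deep network the two generally differ.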
Abstract
Deep neural networks are known to be vulnerable to adversarial attacks (AA). For an image recognition task, this means that a small perturbation of the original can result in the image being misclassified. The design of such attacks, as well as methods of adversarial training against them, is the subject of intense research. We re-cast the problem using techniques of Wasserstein distributionally robust optimization (DRO) and obtain novel contributions leveraging recent insights from DRO sensitivity analysis. We consider a set of distributional threat models. Unlike the traditional pointwise attacks, which assume a uniform bound on perturbation of each input data point, distributional threat models allow attackers to perturb inputs in a non-uniform way. We link these more general attacks with questions of out-of-sample performance and Knightian uncertainty. To evaluate the distributional…
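The contrast between pointwise and distributional threat models can be written in standard Wasserstein DRO notation. The symbols below are assumptions made for illustration and are not taken from the text above: P is the data distribution, f the network, L the loss, δ the attack budget, and W_p the p-Wasserstein distance.

```latex
% Pointwise threat model: each input is perturbed by at most \delta.
\mathbb{E}_{(x,y)\sim P}\Big[\sup_{\|x'-x\|\le\delta} L\big(f(x'),y\big)\Big]

% Distributional threat model: the adversary picks any distribution Q
% within Wasserstein distance \delta of P.
\sup_{Q \,:\, W_p(P,Q)\le\delta}\ \mathbb{E}_{(x,y)\sim Q}\Big[L\big(f(x),y\big)\Big]
```

With finite p, the constraint bounds an average transport cost, so the adversary may move some inputs far and others not at all. The uniform pointwise bound is the usual p = ∞ limit of this formulation.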
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced X-ray and CT Imaging
