Nash Equilibria and Pitfalls of Adversarial Training in Adversarial Robustness Games

Maria-Florina Balcan; Rattana Pukdee; Pradeep Ravikumar; Hongyang Zhang

arXiv:2210.12606·cs.LG·March 1, 2023

TL;DR

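This paper analyzes adversarial training as an alternating best-response strategy in a two-player zero-sum game. It proves that this strategy may fail to converge even for a linear classifier, while a unique pure Nash equilibrium of the game exists and is provably robust. Experiments support both results.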

Contribution

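It models adversarial training as alternating best response in a two-player zero-sum game, proves that this strategy may not converge even in a simple linear setting, and establishes that a unique pure Nash equilibrium exists and is provably robust.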

Findings

01

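Adversarial training, viewed as alternating best response, may not converge even for a linear classifier on a simple statistical model of robust and non-robust features.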

02

A unique pure Nash equilibrium of the game exists and is provably robust.

03

Experiments show the non-convergence of adversarial training and the robustness of the Nash equilibrium.

Abstract

Adversarial training is a standard technique for training adversarially robust models. In this paper, we study adversarial training as an alternating best-response strategy in a two-player zero-sum game. We prove that even in a simple scenario of a linear classifier and a statistical model that abstracts robust vs. non-robust features, the alternating best-response strategy of such a game may not converge. On the other hand, a unique pure Nash equilibrium of the game exists and is provably robust. We support our theoretical results with experiments, showing the non-convergence of adversarial training and the robustness of the Nash equilibrium.
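
To make the non-convergence claim concrete, here is a minimal sketch of alternating best response in a one-dimensional bilinear zero-sum game. This toy is an assumption made for illustration and is not the paper's model: the names `mu`, `eps`, `payoff`, and the two best-response functions are hypothetical. A defender picks a weight w in [-1, 1] on a single feature whose class-conditional mean is mu, and an adversary picks a shift d in [-1, 1] that moves that mean by eps * d. When mu < eps, the players chase each other between corners, even though the pure strategy pair (w, d) = (0, -mu/eps) is an equilibrium of this toy game.

```python
def payoff(w, d, mu, eps):
    """Defender's payoff in the toy game (the adversary receives the negative)."""
    return w * (mu + eps * d)


def defender_best_response(d, mu, eps):
    """Weight w in [-1, 1] that maximizes w * (mu + eps * d) for a fixed shift d."""
    shifted_mean = mu + eps * d
    if shifted_mean == 0:
        return 0.0  # every w is optimal here; 0 is an arbitrary choice
    return 1.0 if shifted_mean > 0 else -1.0


def adversary_best_response(w, mu, eps):
    """Shift d in [-1, 1] that minimizes w * (mu + eps * d) for a fixed weight w."""
    if w == 0:
        return 0.0  # every d is optimal here; 0 is an arbitrary choice
    return -1.0 if w > 0 else 1.0


def alternating_best_response(mu=0.5, eps=1.0, rounds=6, w0=1.0):
    """Alternate adversary and defender best responses, recording (d, w) each round."""
    w = w0
    history = []
    for _ in range(rounds):
        d = adversary_best_response(w, mu, eps)
        w = defender_best_response(d, mu, eps)
        history.append((d, w))
    return history


if __name__ == "__main__":
    for step, (d, w) in enumerate(alternating_best_response(), start=1):
        print(f"round {step}: adversary shift d={d:+.0f}, defender weight w={w:+.0f}")
```

With the default values the iterates alternate between (d, w) = (-1, -1) and (+1, +1) and never settle. At (w, d) = (0, -mu/eps), by contrast, the payoff is zero whatever either player does unilaterally, so neither has an incentive to deviate.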

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Statistical Methods and Models