Robust Attacks against Multiple Classifiers
Juan C. Perdomo, Yaron Singer

TL;DR
This paper develops a game-theoretic framework for creating robust adversarial attacks against multiple classifiers, emphasizing the importance of randomization and equilibrium strategies to enhance attack effectiveness.
Contribution
It designs best response oracles and runs them within a Multiplicative Weights Update framework to compute Nash equilibrium strategies, turning deterministic perturbations against a set of classifiers into optimal mixed attack strategies.
Findings
Attacks are shown to be effective on a series of image classification tasks
Randomized (mixed) attack strategies outperform deterministic ones when the learner has several classifiers
The approach applies to both linear classifiers and deep neural networks
Abstract
We address the challenge of designing optimal adversarial noise algorithms for settings where a learner has access to multiple classifiers. We demonstrate how this problem can be framed as finding strategies at equilibrium in a two-player, zero-sum game between a learner and an adversary. In doing so, we illustrate the need for randomization in adversarial attacks. In order to compute Nash equilibrium, our main technical focus is on the design of best response oracles that can then be implemented within a Multiplicative Weights Update framework to boost deterministic perturbations against a set of models into optimal mixed strategies. We demonstrate the practical effectiveness of our approach on a series of image classification tasks using both linear classifiers and deep neural networks.
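The abstract's combination of a best response oracle with Multiplicative Weights Update can be illustrated on a toy zero-sum game. The following is a minimal sketch, not the authors' implementation: the loss matrix, the function names (`best_response`, `mwu_attack`, `worst_case_loss`), the step size `eta`, and the choice to let the learner run the weight updates while the adversary best-responds are all assumptions made for illustration. Entry `loss[i][j]` is 1 if hypothetical perturbation `j` fools classifier `i` and 0 otherwise.

```python
import math

def best_response(loss, p):
    """Adversary's oracle: the perturbation with the highest expected loss
    against the learner's current mix p over classifiers."""
    n_pert = len(loss[0])
    expected = [sum(p[i] * loss[i][j] for i in range(len(p))) for j in range(n_pert)]
    return max(range(n_pert), key=expected.__getitem__)

def mwu_attack(loss, rounds=500, eta=0.1):
    """Learner runs Multiplicative Weights over classifiers; the adversary
    best-responds each round. The empirical frequency of the adversary's
    choices is returned as its mixed strategy."""
    n_clf, n_pert = len(loss), len(loss[0])
    weights = [1.0] * n_clf
    counts = [0] * n_pert
    for _ in range(rounds):
        total = sum(weights)
        p = [w / total for w in weights]
        j = best_response(loss, p)
        counts[j] += 1
        # Down-weight classifiers that the chosen perturbation fooled.
        weights = [w * math.exp(-eta * loss[i][j]) for i, w in enumerate(weights)]
    return [c / rounds for c in counts]

def worst_case_loss(loss, q):
    """Loss the adversary still gets with mix q over perturbations
    when the learner answers with its single best classifier."""
    return min(sum(q[j] * row[j] for j in range(len(q))) for row in loss)

# Toy game (assumed): each perturbation fools exactly one of two classifiers.
loss = [[1.0, 0.0],
        [0.0, 1.0]]
q = mwu_attack(loss)
mixed_value = worst_case_loss(loss, q)
pure_value = worst_case_loss(loss, [1.0, 0.0])
```

In this toy game any single deterministic perturbation can be dodged by the learner switching to the other classifier, so `pure_value` is 0, whereas the mixed strategy recovered by the loop fools the learner's best classifier about half the time. This is the sense in which randomization is needed once the learner has more than one model.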
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning
