Exploring the Space of Black-box Attacks on Deep Neural Networks
Arjun Nitin Bhagoji, Warren He, Bo Li, Dawn Song

TL;DR
This paper introduces novel Gradient Estimation black-box attacks on deep neural networks that do not depend on transferability, achieving high success rates with fewer queries and outperforming existing methods on standard datasets and real-world classifiers.
Contribution
The paper presents a new Gradient Estimation black-box attack method that reduces query complexity and surpasses transferability-based attacks in effectiveness.
Findings
An iterative variant of the attack achieves close to 100% adversarial success rates on DNNs for both targeted and untargeted attacks.
Outperforms transferability-based black-box attacks on MNIST and CIFAR-10.
Remains effective against state-of-the-art defenses and real-world classifiers.
Abstract
Existing black-box attacks on deep neural networks (DNNs) so far have largely focused on transferability, where an adversarial instance generated for a locally trained model can "transfer" to attack other learning models. In this paper, we propose novel Gradient Estimation black-box attacks for adversaries with query access to the target model's class probabilities, which do not rely on transferability. We also propose strategies to decouple the number of queries required to generate each adversarial sample from the dimensionality of the input. An iterative variant of our attack achieves close to 100% adversarial success rates for both targeted and untargeted attacks on DNNs. We carry out extensive experiments for a thorough comparative evaluation of black-box attacks and show that the proposed Gradient Estimation attacks outperform all transferability based black-box attacks we tested…
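The idea of estimating gradients from probability queries can be illustrated with a minimal sketch. The abstract above does not spell out the estimator, so this example assumes two-sided finite differences on a cross-entropy loss and a signed iterative step; the paper's actual procedure may differ. The toy linear softmax "target", the function names, and the parameters `delta`, `eps`, `steps`, and `group_size` are all hypothetical. Setting `group_size` above 1 shows one possible way to decouple the query count from the input dimension: randomly grouped coordinates share a single directional estimate.

```python
import numpy as np

def make_toy_model(dim, n_classes, seed=0):
    """Hypothetical stand-in for the black-box target: a fixed linear
    softmax classifier that only exposes class probabilities."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_classes, dim))

    def query(x):
        z = W @ x
        e = np.exp(z - z.max())  # numerically stable softmax
        return e / e.sum()

    return query

def loss(query, x, label):
    # Cross-entropy of `label`, computed only from queried probabilities.
    return -np.log(query(x)[label] + 1e-12)

def estimate_gradient(query, x, label, delta=1e-3, group_size=1, rng=None):
    """Two-sided finite-difference estimate of the loss gradient (assumed
    estimator). Each random group of coordinates costs 2 queries and
    shares one directional-derivative value."""
    if rng is None:
        rng = np.random.default_rng(0)
    idx = rng.permutation(x.size)
    grad = np.zeros_like(x)
    n_queries = 0
    for start in range(0, x.size, group_size):
        group = idx[start:start + group_size]
        v = np.zeros_like(x)
        v[group] = 1.0
        diff = loss(query, x + delta * v, label) - loss(query, x - delta * v, label)
        grad[group] = diff / (2 * delta)
        n_queries += 2
    return grad, n_queries

def untargeted_attack(query, x, label, eps=0.5, steps=10, **kw):
    """Iterative signed-gradient ascent on the loss, kept inside an
    L-infinity ball of radius eps around the original input."""
    x_adv = x.copy()
    alpha = eps / steps
    total_queries = 0
    for _ in range(steps):
        g, q = estimate_gradient(query, x_adv, label, **kw)
        total_queries += q
        x_adv = np.clip(x_adv + alpha * np.sign(g), x - eps, x + eps)
    return x_adv, total_queries

if __name__ == "__main__":
    query = make_toy_model(dim=20, n_classes=3)
    x = np.random.default_rng(1).normal(size=20)
    label = int(np.argmax(query(x)))
    x_adv, used = untargeted_attack(query, x, label, group_size=5)
    print("queries used:", used)
    print("loss before/after:", loss(query, x, label), loss(query, x_adv, label))
```

With `group_size=1` each gradient estimate costs two queries per input coordinate; with 20 coordinates and `group_size=5` it costs 8 queries per step, at the price of a coarser estimate.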
Taxonomy
Topics: Adversarial Robustness in Machine Learning
