Exploring the Space of Black-box Attacks on Deep Neural Networks

Arjun Nitin Bhagoji; Warren He; Bo Li; Dawn Song

arXiv:1712.09491 · cs.LG · December 29, 2017 · 70 cites

TL;DR

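This paper introduces Gradient Estimation black-box attacks on deep neural networks that need only query access to the target model's class probabilities and do not depend on transferability. It also proposes strategies that decouple the number of queries per adversarial sample from the input dimensionality. The attacks outperform the transferability-based attacks tested on standard datasets and remain effective against real-world classifiers.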

Contribution

The paper presents a new Gradient Estimation black-box attack method that reduces query complexity and surpasses transferability-based attacks in effectiveness.

Findings

01. An iterative variant of the attack achieves close to 100% adversarial success rates on DNNs for both targeted and untargeted attacks.

02. Outperforms the transferability-based black-box attacks tested on MNIST and CIFAR-10.

03. Remains effective against state-of-the-art defenses and real-world classifiers.

Abstract

Existing black-box attacks on deep neural networks (DNNs) have largely focused on transferability, where an adversarial instance generated for a locally trained model can "transfer" to attack other learning models. In this paper, we propose novel Gradient Estimation black-box attacks for adversaries with query access to the target model's class probabilities, which do not rely on transferability. We also propose strategies to decouple the number of queries required to generate each adversarial sample from the dimensionality of the input. An iterative variant of our attack achieves close to 100% adversarial success rates for both targeted and untargeted attacks on DNNs. We carry out extensive experiments for a thorough comparative evaluation of black-box attacks and show that the proposed Gradient Estimation attacks outperform all transferability-based black-box attacks we tested…
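
To make the idea of a query-only attack concrete, here is a minimal sketch of how a gradient might be estimated from class-probability queries alone. The abstract does not spell out the estimator, so the two-sided finite-difference scheme, the random coordinate grouping used to cut the query count, the toy softmax model, and every name below (`make_toy_model`, `estimate_gradient`, `untargeted_step`) are illustrative assumptions and are not taken from the paper's repository.

```python
import math
import random


def make_toy_model(weights):
    """Hypothetical stand-in for a black-box target: softmax over a linear model.

    The attacker only ever calls the returned query function.
    """
    def query(x):
        logits = [sum(w * v for w, v in zip(row, x)) for row in weights]
        top = max(logits)
        exps = [math.exp(z - top) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]
    return query


def estimate_gradient(query, x, label, delta=1e-3, group_size=1, seed=0):
    """Estimate the gradient of loss = -log p[label] using queries only.

    Assumption: two-sided finite differences. With group_size > 1, randomly
    chosen coordinates are perturbed together and share one averaged estimate,
    so the query count is 2 * ceil(len(x) / group_size) rather than 2 * len(x).
    """
    order = list(range(len(x)))
    random.Random(seed).shuffle(order)
    grad = [0.0] * len(x)
    n_queries = 0
    for start in range(0, len(x), group_size):
        group = order[start:start + group_size]
        x_plus, x_minus = list(x), list(x)
        for i in group:
            x_plus[i] += delta
            x_minus[i] -= delta
        loss_plus = -math.log(query(x_plus)[label])
        loss_minus = -math.log(query(x_minus)[label])
        n_queries += 2
        slope = (loss_plus - loss_minus) / (2 * delta)
        for i in group:
            grad[i] = slope / len(group)  # shared, averaged estimate
    return grad, n_queries


def untargeted_step(query, x, label, eps=0.5, **kwargs):
    """Single signed step that increases the loss on the true label."""
    grad, n_queries = estimate_gradient(query, x, label, **kwargs)
    x_adv = [v + eps * ((g > 0) - (g < 0)) for v, g in zip(x, grad)]
    return x_adv, n_queries


# Toy usage: two classes, four input dimensions.
query = make_toy_model([[1.0, -1.0, 0.5, 0.0], [-1.0, 1.0, -0.5, 0.0]])
x = [0.2, -0.1, 0.1, 0.3]
x_adv, n_queries = untargeted_step(query, x, label=0)
print(query(x)[0], query(x_adv)[0], n_queries)
```

With `group_size=1` the sketch spends two queries per input dimension; raising `group_size` trades estimate quality for fewer queries, which is one plausible reading of decoupling the query count from the input dimensionality. An iterative variant would repeat `untargeted_step` with a smaller `eps` and clip the result to the allowed perturbation budget.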

Code & Models

Repositories

sunblaze-ucb/blackbox-attacks (TensorFlow, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Cardiac Arrest and Resuscitation