Simple Black-Box Adversarial Perturbations for Deep Networks
Nina Narodytska, Shiva Prasad Kasiviswanathan

TL;DR
This paper demonstrates that deep neural networks are vulnerable to simple black-box adversarial attacks that need no knowledge of the network's internals and perturb only a few pixels, which poses a security risk for deployed systems.
Contribution
The authors introduce simple, effective black-box attack methods that craft adversarial examples by perturbing a small number of pixels without internal network knowledge.
Findings
Elementary attacks can successfully fool deep networks
Perturbing a few pixels causes misclassification
Attacks remain effective when the adversary can only observe the network's outputs on probed inputs
Abstract
Deep neural networks are powerful and popular learning models that achieve state-of-the-art pattern recognition performance on many computer vision, speech, and language processing tasks. However, these networks have also been shown susceptible to carefully crafted adversarial perturbations which force misclassification of the inputs. Adversarial examples enable adversaries to subvert the expected system behavior leading to undesired consequences and could pose a security risk when these systems are deployed in the real world. In this work, we focus on deep convolutional neural networks and demonstrate that adversaries can easily craft adversarial examples even without any internal knowledge of the target network. Our attacks treat the network as an oracle (black-box) and only assume that the output of the network can be observed on the probed inputs. Our first attack is based on a…
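The oracle setting described in the abstract can be illustrated with a small sketch. This is a generic random-search illustration under stated assumptions, not the authors' algorithm: the names `make_toy_oracle` and `few_pixel_attack`, the toy linear classifier standing in for a deep network, the choice to set pixels to 0.0 or 1.0, and the query budget are all hypothetical. The attacker only calls the oracle and reads the predicted label.

```python
import random


def make_toy_oracle(weights):
    """Hypothetical stand-in for a deployed network.

    The attacker sees only the returned label, never the weights.
    """
    def oracle(image):
        pixels = [p for row in image for p in row]
        scores = [sum(w * p for w, p in zip(class_w, pixels))
                  for class_w in weights]
        # Index of the highest score (first index wins on ties).
        return max(range(len(scores)), key=scores.__getitem__)
    return oracle


def few_pixel_attack(oracle, image, num_pixels=3, max_queries=500, seed=None):
    """Random search: overwrite a few pixels until the label changes.

    Returns (adversarial_image, queries_used), or (None, max_queries)
    if the budget runs out. The initial query for the clean label is
    not counted in queries_used.
    """
    rng = random.Random(seed)
    original_label = oracle(image)
    height, width = len(image), len(image[0])
    positions = [(r, c) for r in range(height) for c in range(width)]
    for query in range(1, max_queries + 1):
        candidate = [row[:] for row in image]  # leave the input untouched
        for r, c in rng.sample(positions, num_pixels):
            # Assumption: push each chosen pixel to an extreme value.
            candidate[r][c] = rng.choice([0.0, 1.0])
        if oracle(candidate) != original_label:
            return candidate, query
    return None, max_queries


# Toy setup: 2 classes, 4x4 grayscale image with every pixel at 0.5.
toy_weights = [[0.0] * 16,
               [1.0 if k % 2 == 0 else -1.0 for k in range(16)]]
toy_weights[1][1] = -1.1  # makes class 0 win narrowly on the clean image
toy_oracle = make_toy_oracle(toy_weights)
clean_image = [[0.5] * 4 for _ in range(4)]

adversarial, queries_used = few_pixel_attack(toy_oracle, clean_image, seed=0)
if adversarial is not None:
    print("label flipped after", queries_used, "queries")
```

The sketch uses only label feedback; an attacker who can also read output scores could use them to guide the search instead of sampling blindly.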
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications
