TL;DR
This paper introduces improved Jacobian-based saliency map attacks (WJSMA and TJSMA) that generate sparse adversarial examples faster and more effectively than the original JSMA; they run far faster than CW $L_0$ and sometimes match it in effectiveness.
Contribution
The paper proposes two variants of JSMA whose saliency maps are weighted by the classifier's output probabilities and input features, improving attack speed and effectiveness for $L_0$ adversarial attacks.
Findings
WJSMA and TJSMA are over 50 times faster than CW $L_0$ on CIFAR-10.
The new attacks outperform original JSMA in efficiency and sometimes match CW $L_0$ in effectiveness.
Experiments on MNIST, CIFAR-10, and GTSRB datasets validate the improved performance.
Abstract
Neural network classifiers (NNCs) are known to be vulnerable to malicious adversarial perturbations of inputs, including those modifying a small fraction of the input features, named sparse or $L_0$ attacks. Effective and fast attacks, such as the widely used Jacobian-based Saliency Map Attack (JSMA), are practical both for fooling NNCs and for improving their robustness. In this paper, we show that penalising saliency maps of JSMA by the output probabilities and the input features of the NNC yields more powerful attack algorithms that better take into account each input's characteristics. This leads us to introduce improved versions of JSMA, named Weighted JSMA (WJSMA) and Taylor JSMA (TJSMA), and demonstrate through a variety of white-box and black-box experiments on three different datasets (MNIST, CIFAR-10 and GTSRB), that they are both significantly faster and more efficient…
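To make the idea of penalising a saliency map concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes a single-feature (not pixel-pair) saliency score, a toy linear softmax classifier with an analytic Jacobian, features in [0, 1] that are only ever increased, a WJSMA-style weighting of non-target Jacobian rows by the output probabilities, and a TJSMA-style extra factor of (1 - x_i). The exact weighting forms and all function names (`jacobian_softmax_linear`, `saliency`) are assumptions made for illustration and are not taken from the abstract above.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def jacobian_softmax_linear(W, b, x):
    """Jacobian dF_k/dx_i of a toy softmax(Wx + b) classifier.

    Returns (jac, probs) with jac of shape (classes, features).
    """
    p = softmax(W @ x + b)
    # Chain rule: dp/dz = diag(p) - p p^T, and dz/dx = W.
    return (np.diag(p) - np.outer(p, p)) @ W, p

def saliency(jac, probs, x, target, variant="jsma"):
    """Single-feature saliency scores for increasing features (assumed forms)."""
    alpha = jac[target]  # effect of each feature on the target class
    others = np.delete(np.arange(len(probs)), target)
    if variant == "jsma":
        # Plain sum of the non-target Jacobian rows.
        beta = jac[others].sum(axis=0)
    else:
        # Assumed WJSMA-style penalty: weight each row by its class probability.
        beta = (probs[others, None] * jac[others]).sum(axis=0)
    if variant == "tjsma":
        # Assumed TJSMA-style penalty: scale by the room left to increase x_i.
        alpha = alpha * (1.0 - x)
        beta = beta * (1.0 - x)
    # Keep features that push the target up and the other classes down.
    mask = (alpha > 0) & (beta < 0)
    return np.where(mask, alpha * np.abs(beta), 0.0)

# Toy usage: 3 classes, 5 features, target class 2.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(3, 5)), np.zeros(3)
x = np.array([0.2, 1.0, 0.5, 0.0, 0.8])
jac, probs = jacobian_softmax_linear(W, b, x)
scores = {v: saliency(jac, probs, x, target=2, variant=v)
          for v in ("jsma", "wjsma", "tjsma")}
```

In this sketch the feature with the highest score would be the next one to perturb. A feature already at its maximum value (here `x[1] = 1.0`) gets a zero score under the input-weighted variant, which is one way a saliency map can take each input's characteristics into account.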
