Detecting Adversarial Samples for Deep Neural Networks through Mutation Testing
Jingyi Wang, Jun Sun, Peixin Zhang, Xinyu Wang

TL;DR
This paper introduces nMutant, a mutation testing-inspired statistical method to detect adversarial samples in deep neural networks by exploiting their higher sensitivity to perturbations, enhancing security in safety-critical applications.
Contribution
It proposes a novel detection technique for adversarial samples based on mutation testing principles, demonstrating effectiveness against recent attack methods.
Findings
nMutant effectively detects most adversarial samples
The detection result comes with an error bound at a stated statistical significance level
Adversarial samples are more sensitive to perturbations than normal samples
Abstract
Recently, it has been shown that deep neural networks (DNNs) are subject to attacks through adversarial samples. Adversarial samples are often crafted through adversarial perturbation, i.e., manipulating the original sample with minor modifications so that the DNN model labels the sample incorrectly. Given that it is almost impossible to train a perfect DNN, adversarial samples are shown to be easy to generate. As DNNs are increasingly used in safety-critical systems like autonomous cars, it is crucial to develop techniques for defending against such attacks. Existing defense mechanisms, which aim to make adversarial perturbation challenging, have been shown to be ineffective. In this work, we propose an alternative approach. We first observe that adversarial samples are much more sensitive to perturbations than normal samples. That is, if we impose random perturbations on a normal and an adversarial…
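The sensitivity observation in the abstract can be illustrated with a small sketch. This is not the nMutant algorithm itself: it is a toy illustration, assuming a hypothetical two-feature linear classifier (`toy_predict`) in place of a DNN, and a hypothetical helper `label_change_rate` that counts how often random perturbations flip the predicted label. The perturbation size, the number of mutants, and the example inputs are all arbitrary choices made for illustration; the statistical test and error bound mentioned above are omitted.

```python
import random


def label_change_rate(predict, sample, n_mutants=200, step=0.1, seed=0):
    """Fraction of randomly perturbed copies of `sample` whose predicted
    label differs from the label of the unperturbed sample."""
    rng = random.Random(seed)
    original_label = predict(sample)
    changed = 0
    for _ in range(n_mutants):
        # Perturb every feature by a small uniform random amount.
        mutant = [x + rng.uniform(-step, step) for x in sample]
        if predict(mutant) != original_label:
            changed += 1
    return changed / n_mutants


def toy_predict(x):
    # Hypothetical stand-in for a DNN: a linear boundary at x0 + x1 = 1.
    return 1 if x[0] + x[1] > 1.0 else 0


# A point far from the decision boundary plays the role of a normal sample;
# a point just across the boundary plays the role of an adversarial sample.
far_from_boundary = [0.1, 0.1]
near_boundary = [0.52, 0.49]

normal_rate = label_change_rate(toy_predict, far_from_boundary)
adversarial_rate = label_change_rate(toy_predict, near_boundary)
print(normal_rate, adversarial_rate)
```

In this toy setting the point near the boundary changes label under many of the random perturbations, while the point far from it never does. A detector built on this idea would flag inputs whose label change rate is unusually high; how that threshold is chosen and tested statistically is not covered by this sketch.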
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Integrated Circuits and Semiconductor Failure Analysis · Physical Unclonable Functions (PUFs) and Hardware Security
