# Adversarial Sample Detection for Deep Neural Network through Model Mutation Testing

**Authors:** Jingyi Wang, Guoliang Dong, Jun Sun, Xinyu Wang, Peixin Zhang

arXiv: 1812.05793 · 2019-11-22

## TL;DR

This paper introduces a runtime method for detecting adversarial samples in deep neural networks. It measures how sensitive an input is to random mutations of the model and uses that sensitivity to distinguish adversarial inputs from normal ones.

## Contribution

It proposes a measure of input sensitivity under random model mutation and combines model mutation testing with statistical hypothesis testing to decide at runtime whether an input is likely normal or adversarial.

## Key findings

- Effective detection of adversarial samples on MNIST and CIFAR10
- High detection accuracy with low false positives
- Detects samples generated by state-of-the-art attacks

## Abstract

Deep neural networks (DNNs) have been shown to be useful in a wide range of applications. However, they are also known to be vulnerable to adversarial samples. By transforming a normal sample with carefully crafted, human-imperceptible perturbations, an attacker can make even highly accurate DNNs produce wrong decisions. Multiple defense mechanisms have been proposed that aim to hinder the generation of such adversarial samples. However, recent work shows that most of them are ineffective. In this work, we propose an alternative approach that detects adversarial samples at runtime. Our main observation is that adversarial samples are much more sensitive than normal samples when random mutations are imposed on the DNN. We thus first propose a measure of "sensitivity" and show empirically that normal samples and adversarial samples have distinguishable sensitivity. We then integrate statistical hypothesis testing and model mutation testing to check, at runtime, whether an input sample is likely to be normal or adversarial by measuring its sensitivity. We evaluated our approach on the MNIST and CIFAR10 datasets. The results show that our approach detects adversarial samples generated by state-of-the-art attack methods efficiently and accurately.
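The detection idea in the abstract can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration, since the abstract does not specify these details: sensitivity is taken to be the fraction of mutated models whose predicted label differs from the original model's (`label_change_rate`), a mutation is Gaussian noise added to the weights (`mutate`), the model is a two-weight linear classifier (`predict`), and the hypothesis test is a Wald sequential probability ratio test with hypothetical rates `p_normal` and `p_adv`. An input close to the decision boundary stands in for an adversarial sample.

```python
import math
import random
from typing import List, Optional

def predict(weights: List[float], x: List[float]) -> int:
    # Toy linear classifier (assumption): label 1 if the dot product is non-negative.
    return 1 if sum(w * xi for w, xi in zip(weights, x)) >= 0 else 0

def mutate(weights: List[float], rng: random.Random, scale: float = 0.1) -> List[float]:
    # Hypothetical mutation operator: add Gaussian noise to every weight.
    return [w + rng.gauss(0.0, scale) for w in weights]

def label_change_rate(weights, x, n_mutants: int = 200, seed: int = 0) -> float:
    # Assumed sensitivity measure: share of mutants that disagree with the original label.
    rng = random.Random(seed)
    base = predict(weights, x)
    changed = sum(predict(mutate(weights, rng), x) != base for _ in range(n_mutants))
    return changed / n_mutants

def sprt_is_adversarial(weights, x, p_normal: float = 0.05, p_adv: float = 0.20,
                        alpha: float = 0.05, beta: float = 0.05,
                        max_mutants: int = 500, seed: int = 0) -> Optional[bool]:
    # Wald SPRT on Bernoulli outcomes "label changed under mutation".
    # H0: change rate = p_normal, H1: change rate = p_adv (both rates are assumptions).
    rng = random.Random(seed)
    base = predict(weights, x)
    upper = math.log((1 - beta) / alpha)   # cross this: accept H1 (adversarial)
    lower = math.log(beta / (1 - alpha))   # cross this: accept H0 (normal)
    llr = 0.0
    for _ in range(max_mutants):
        if predict(mutate(weights, rng), x) != base:
            llr += math.log(p_adv / p_normal)
        else:
            llr += math.log((1 - p_adv) / (1 - p_normal))
        if llr >= upper:
            return True
        if llr <= lower:
            return False
    return None  # undecided within the mutant budget

weights = [1.0, -1.0]
far_from_boundary = [2.0, -2.0]   # robust input: mutations rarely flip its label
near_boundary = [0.5, 0.5]        # fragile input: sits on the decision boundary

print(label_change_rate(weights, far_from_boundary))
print(label_change_rate(weights, near_boundary))
print(sprt_is_adversarial(weights, far_from_boundary))
print(sprt_is_adversarial(weights, near_boundary))
```

The sequential test stops as soon as the evidence is strong enough in either direction, so a clearly robust input needs only a handful of mutants. A fixed-size sample with a threshold on the change rate would be a simpler alternative, at the cost of always generating the full set of mutants.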

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.05793/full.md

## Figures

26 figures with captions in the complete paper: https://tomesphere.com/paper/1812.05793/full.md

## References

56 references — full list in the complete paper: https://tomesphere.com/paper/1812.05793/full.md

---
Source: https://tomesphere.com/paper/1812.05793