On the (Statistical) Detection of Adversarial Examples

Kathrin Grosse; Praveen Manoharan; Nicolas Papernot; Michael Backes; Patrick McDaniel

arXiv:1702.06280 · cs.CR · October 18, 2017 · 377 cites

Open Access

TL;DR

This paper demonstrates that adversarial examples differ statistically from original data and can be detected using statistical tests, while also proposing an augmented model to identify such inputs with high accuracy.

Contribution

The paper introduces a statistical-test-based method for detecting adversarial examples and a model augmented with an additional output trained to classify adversarial inputs, improving detection accuracy and robustness.

Findings

01. Statistical tests can detect adversarial examples with sample sizes between 10 and 100.

02. Augmented models classify adversarial inputs with over 80% accuracy.

03. Perturbation required for successful attacks increases by more than 150% when detection is employed.
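
The first finding rests on a two-sample test: given a batch of known-clean inputs and a batch of suspect inputs, decide whether both were drawn from the same distribution. The summary above does not say which test statistic is used, so the following is only a minimal sketch under stated assumptions: it uses a Gaussian-kernel maximum mean discrepancy (MMD) statistic with a permutation test, and the function names, the `bandwidth` parameter, and the default number of permutations are all illustrative choices rather than anything taken from the paper.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    # Pairwise squared Euclidean distances, clipped at 0 to absorb rounding error.
    sq = (np.sum(a ** 2, axis=1)[:, None]
          + np.sum(b ** 2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth ** 2))


def mmd2(x, y, bandwidth):
    """Biased estimate of the squared MMD between samples x and y."""
    return (gaussian_kernel(x, x, bandwidth).mean()
            + gaussian_kernel(y, y, bandwidth).mean()
            - 2.0 * gaussian_kernel(x, y, bandwidth).mean())


def permutation_test(x, y, bandwidth=1.0, n_permutations=500, seed=0):
    """Approximate p-value for the hypothesis that x and y share a distribution."""
    rng = np.random.default_rng(seed)
    observed = mmd2(x, y, bandwidth)
    pooled = np.vstack([x, y])
    n = len(x)
    at_least_as_large = 0
    for _ in range(n_permutations):
        # Shuffle the pooled sample and split it back into two groups.
        perm = rng.permutation(len(pooled))
        stat = mmd2(pooled[perm[:n]], pooled[perm[n:]], bandwidth)
        if stat >= observed:
            at_least_as_large += 1
    # Add-one correction keeps the p-value strictly positive.
    return (at_least_as_large + 1) / (n_permutations + 1)
```

A small p-value suggests the suspect batch was not drawn from the clean distribution. Note that a test of this kind judges a whole batch, not an individual input.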

Abstract

Machine Learning (ML) models are applied in a variety of tasks such as network intrusion detection or malware classification. Yet, these models are vulnerable to a class of malicious inputs known as adversarial examples. These are slightly perturbed inputs that are classified incorrectly by the ML model. The mitigation of these adversarial inputs remains an open problem. As a step towards understanding adversarial examples, we show that they are not drawn from the same distribution as the original data, and can thus be detected using statistical tests. Using this knowledge, we introduce a complementary approach to identify specific inputs that are adversarial. Specifically, we augment our ML model with an additional output, in which the model is trained to classify all adversarial inputs. We evaluate our approach on multiple adversarial example crafting methods (including the fast…
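
The augmented-output idea in the abstract can be illustrated with label bookkeeping alone: a classifier with K original classes gets one extra output, index K, and every adversarial training input is labelled with that index. This is a rough sketch, not the authors' implementation; the helper names `build_augmented_training_set` and `flag_adversarial` are hypothetical, and the model that produces the logits is assumed to exist elsewhere.

```python
import numpy as np


def build_augmented_training_set(x_clean, y_clean, x_adv, n_classes):
    """Append adversarial inputs, all labelled with the extra class index n_classes."""
    y_adv = np.full(len(x_adv), n_classes, dtype=y_clean.dtype)
    x_aug = np.concatenate([x_clean, x_adv], axis=0)
    y_aug = np.concatenate([y_clean, y_adv], axis=0)
    return x_aug, y_aug


def flag_adversarial(logits, n_classes):
    """Return a boolean mask marking inputs whose top output is the extra class.

    logits is assumed to have shape (batch, n_classes + 1).
    """
    return np.argmax(logits, axis=1) == n_classes
```

Unlike a batch-level statistical test, this check gives a decision for each individual input, which is what the abstract means by identifying specific inputs that are adversarial.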

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Anomaly Detection Techniques and Applications