Detecting Adversarial Samples from Artifacts

Reuben Feinman; Ryan R. Curtin; Saurabh Shintre; Andrew B. Gardner

arXiv:1703.00410 · stat.ML · November 17, 2017 · 378 cites

TL;DR

This paper proposes a method for detecting adversarial inputs to neural networks from two signals: Bayesian uncertainty estimates obtained from dropout, and density estimates in the model's deep feature space. It reports 85-93% ROC-AUC across datasets and attack types.

Contribution

It introduces an attack-agnostic implicit adversarial detection method using Bayesian uncertainty and deep feature density estimation.
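The density half of this idea can be illustrated with a small kernel density estimate over feature vectors. The following is a minimal sketch, not the authors' implementation: the Gaussian kernel, the `bandwidth` value, the function name `kernel_density`, and the toy 2-D "deep features" are all assumptions made for illustration. In practice the features would come from a late hidden layer of the trained network, restricted to training points of the predicted class.

```python
import math

def kernel_density(feature, train_features, bandwidth=1.0):
    """Mean Gaussian-kernel similarity between `feature` and reference features.

    A low value suggests the point lies far from the training data of that
    class in feature space. The kernel and bandwidth are illustrative choices.
    """
    total = 0.0
    for ref in train_features:
        sq_dist = sum((a - b) ** 2 for a, b in zip(feature, ref))
        total += math.exp(-sq_dist / bandwidth ** 2)
    return total / len(train_features)

# Hypothetical 2-D features for training points of the predicted class.
class_features = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1], [0.05, 0.0]]

near_training_data = kernel_density([0.0, 0.05], class_features)
far_from_training_data = kernel_density([3.0, 3.0], class_features)
```

Under these assumptions, a sample whose features sit close to the class's training features receives a density near 1, while a sample far from them receives a density near 0, which is the kind of gap a detector could threshold or feed to a classifier.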

Findings

01. Achieves 85-93% ROC-AUC in detecting adversarial samples.

02. Generalizes well across different datasets and neural network architectures.

03. Effective against various attack algorithms.
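For readers unfamiliar with the metric, ROC-AUC for a detector can be read as the probability that a randomly chosen adversarial sample receives a higher detector score than a randomly chosen non-adversarial one. The sketch below computes it by direct pairwise comparison; the function name `roc_auc` and the scores are made up for illustration and are not results from the paper.

```python
def roc_auc(scores_positive, scores_negative):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))

# Hypothetical detector scores: higher means "more likely adversarial".
adversarial_scores = [0.9, 0.8, 0.35, 0.7]
clean_scores = [0.1, 0.4, 0.2, 0.3]

# 15 of the 16 pairs are ordered correctly, so auc == 0.9375.
auc = roc_auc(adversarial_scores, clean_scores)
```

The pairwise form is quadratic in the number of samples, which is fine for a toy example; a rank-based computation would be the usual choice on a full test set.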

Abstract

Deep neural networks (DNNs) are powerful nonlinear architectures that are known to be robust to random perturbations of the input. However, these models are vulnerable to adversarial perturbations--small input changes crafted explicitly to fool the model. In this paper, we ask whether a DNN can distinguish adversarial samples from their normal and noisy counterparts. We investigate model confidence on adversarial samples by looking at Bayesian uncertainty estimates, available in dropout neural networks, and by performing density estimation in the subspace of deep features learned by the model. The result is a method for implicit adversarial detection that is oblivious to the attack algorithm. We evaluate this method on a variety of standard datasets including MNIST and CIFAR-10 and show that it generalizes well across different architectures and attacks. Our findings report that 85-93%…
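The Bayesian uncertainty signal mentioned in the abstract can be sketched by keeping dropout active at prediction time, running several stochastic forward passes, and measuring how much the outputs vary. The code below is a toy illustration under stated assumptions: the linear-softmax model, the weights, the dropout rate, the number of passes, and the names `dropout_forward` and `mc_dropout_uncertainty` are all hypothetical, and the spread measure used here (mean squared norm of the outputs minus the squared norm of their mean) is one common choice rather than a detail confirmed on this page.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def dropout_forward(x, weights, rng, drop_prob=0.5):
    """One stochastic pass of a toy linear-softmax model with dropout on the inputs."""
    keep = 1.0 - drop_prob
    # Inverted dropout: surviving inputs are rescaled by 1 / keep.
    dropped = [xi / keep if rng.random() < keep else 0.0 for xi in x]
    logits = [sum(w * d for w, d in zip(row, dropped)) for row in weights]
    return softmax(logits)

def mc_dropout_uncertainty(x, weights, passes=50, seed=0):
    """Spread of the softmax outputs over repeated dropout passes (always >= 0)."""
    rng = random.Random(seed)
    outputs = [dropout_forward(x, weights, rng) for _ in range(passes)]
    n_classes = len(outputs[0])
    mean = [sum(o[k] for o in outputs) / passes for k in range(n_classes)]
    mean_sq_norm = sum(sum(p * p for p in o) for o in outputs) / passes
    return mean_sq_norm - sum(m * m for m in mean)

# Hypothetical two-class model and inputs.
toy_weights = [[1.0, -1.0], [-1.0, 1.0]]
uncertainty = mc_dropout_uncertainty([2.0, 0.5], toy_weights)
uncertainty_zero_input = mc_dropout_uncertainty([0.0, 0.0], toy_weights)
```

In this toy setup an all-zero input is unaffected by dropout, so every pass returns the same output and the uncertainty is zero, while a non-zero input yields a positive value. A detector would compare such values against those observed on normal and noisy inputs.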

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Forensic and Genetic Research

Methods: Dropout