Probabilistic Modeling of Deep Features for Out-of-Distribution and Adversarial Detection

Nilesh A. Ahuja; Ibrahima Ndiour; Trushant Kalyanpur; Omesh Tickoo

arXiv:1909.11786 · stat.ML · September 27, 2019 · 37 citations

TL;DR

This paper detects out-of-distribution and adversarial inputs to deep neural networks by fitting parametric probability distributions to deep features, improving AUPR and AUROC by up to 12 percentage points on image and video data.

Contribution

The paper proposes fitting Gaussian and Gaussian mixture models to deep features for OOD and adversarial detection, and evaluates the method on multiple datasets (including MNIST and CIFAR10) and architectures.

Findings

01  Improved detection of OOD and adversarial samples by up to 12 percentage points in AUPR and AUROC.

02  Effective application to both image and video data, including state-of-the-art adversarial attacks.

03  First methodology reported for reliably detecting white-box adversarial framing in video classifiers.

Abstract

We present a principled approach for detecting out-of-distribution (OOD) and adversarial samples in deep neural networks. Our approach consists of modeling the outputs of the various layers (deep features) with parametric probability distributions once training is completed. At inference, the likelihoods of the deep features with respect to the previously learned distributions are calculated and used to derive uncertainty estimates that can discriminate in-distribution samples from OOD samples. We explore two classes of multivariate distributions for modeling the deep features, Gaussian and Gaussian mixture, and study the trade-off between accuracy and computational complexity. We demonstrate the benefits of our approach on image features by detecting OOD images and adversarially generated images, using popular DNN architectures on the MNIST and CIFAR10 datasets. We show that more precise…
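
The fit-then-score procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one Gaussian per class, a small ridge term `reg` on the covariance, the maximum class log-likelihood as the score, and synthetic 8-dimensional vectors standing in for real layer outputs. The function names `fit_class_gaussians` and `log_likelihood_score` are hypothetical.

```python
import numpy as np


def fit_class_gaussians(features, labels, reg=1e-3):
    """Fit one multivariate Gaussian per class to in-distribution features.

    Assumption: class-conditional modeling with a ridge-regularized covariance.
    """
    models = {}
    dim = features.shape[1]
    for c in np.unique(labels):
        x = features[labels == c]
        mean = x.mean(axis=0)
        # The ridge term keeps the covariance invertible for small sample sizes.
        cov = np.cov(x, rowvar=False) + reg * np.eye(dim)
        _, logdet = np.linalg.slogdet(cov)
        models[int(c)] = (mean, np.linalg.inv(cov), logdet)
    return models


def log_likelihood_score(features, models):
    """Return, per sample, the largest class-conditional Gaussian log-likelihood."""
    dim = features.shape[1]
    per_class = []
    for mean, precision, logdet in models.values():
        diff = features - mean
        # Squared Mahalanobis distance of every sample to this class mean.
        maha = np.einsum("ij,jk,ik->i", diff, precision, diff)
        per_class.append(-0.5 * (maha + logdet + dim * np.log(2.0 * np.pi)))
    return np.max(np.stack(per_class, axis=1), axis=1)


# Synthetic stand-ins for deep features (illustrative data only).
rng = np.random.default_rng(0)
train_x = np.concatenate(
    [rng.normal(-2.0, 1.0, (200, 8)), rng.normal(2.0, 1.0, (200, 8))]
)
train_y = np.repeat([0, 1], 200)
models = fit_class_gaussians(train_x, train_y)

in_dist = rng.normal(2.0, 1.0, (100, 8))   # resembles training class 1
ood = rng.normal(10.0, 1.0, (100, 8))      # far from both class means
in_scores = log_likelihood_score(in_dist, models)
ood_scores = log_likelihood_score(ood, models)
```

A low score marks a sample as unlikely under every fitted distribution, so thresholding the score gives a simple OOD detector. Replacing each Gaussian with a Gaussian mixture would follow the same pattern at a higher computational cost, which is the trade-off the abstract mentions.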

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Generative Adversarial Networks and Image Synthesis