Adversarial Profiles: Detecting Out-Distribution & Adversarial Samples in Pre-trained CNNs

Arezoo Rajabi; Rakesh B. Bobba

arXiv:2011.09123·cs.CV·November 19, 2020

Open Access

TL;DR

This paper introduces a method for detecting adversarial and out-distribution samples in pre-trained CNNs without retraining. It builds class-specific adversarial profiles using a single attack technique; an initial evaluation on MNIST detects at least 92% of out-distribution examples and 59% of adversarial examples.

Contribution

The authors propose a detection approach that requires neither retraining the CNN nor access to a wide variety of fooling examples. It relies on class-specific adversarial profiles built with a single attack generation technique.

Findings

1. Detects at least 92% of out-distribution examples

2. Detects 59% of adversarial examples

3. Effective on the MNIST dataset

Abstract

Despite the high accuracy of Convolutional Neural Networks (CNNs), they are vulnerable to adversarial and out-distribution examples. Many methods have been proposed to detect these fooling examples or to make CNNs robust against them. However, most such methods need access to a wide range of fooling examples to retrain the network or to tune detection parameters. Here, we propose a method to detect adversarial and out-distribution examples against a pre-trained CNN without needing to retrain the CNN or to access a wide variety of fooling examples. To this end, we create adversarial profiles for each class using only one adversarial attack generation technique. We then wrap a detector around the pre-trained CNN that applies the created adversarial profile to each input and uses the output to decide whether or not the input is legitimate. Our initial evaluation of this approach…
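The wrapping idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the pre-trained CNN is replaced by a toy linear scorer, the per-class perturbations are handcrafted rather than produced by an attack, and the decision rule (accept the input if enough perturbations from the predicted class's profile flip it to their intended target class) is one plausible reading of "uses the output to decide". The names `predict`, `profile_match_rate`, `is_legitimate`, and `threshold` are hypothetical.

```python
import numpy as np

def predict(weights, x):
    """Toy stand-in for a pre-trained CNN: linear scores, arg-max class."""
    return int(np.argmax(weights @ x))

def profile_match_rate(weights, profiles, x):
    """Apply the predicted class's profile to x and report how often each
    perturbation moves the prediction to its intended target class."""
    source = predict(weights, x)
    profile = profiles[source]  # assumed layout: {target_class: perturbation}
    hits = sum(predict(weights, x + delta) == target
               for target, delta in profile.items())
    return source, hits / len(profile)

def is_legitimate(weights, profiles, x, threshold=0.5):
    """Assumed decision rule: accept x if enough perturbations behave as expected."""
    _, rate = profile_match_rate(weights, profiles, x)
    return rate >= threshold

# Toy setup: 3 classes, identity weights, so class i is the largest coordinate.
num_classes = 3
W = np.eye(num_classes)
basis = np.eye(num_classes)

# Handcrafted profile: for source s and target t, shift mass from s to t.
# A real profile would come from an adversarial attack on the actual CNN.
profiles = {
    s: {t: 0.6 * (basis[t] - basis[s]) for t in range(num_classes) if t != s}
    for s in range(num_classes)
}

typical = np.array([1.0, 0.1, 0.1])   # resembles the inputs the profile was built for
atypical = np.array([5.0, 0.0, 0.0])  # also predicted as class 0, but far from them

typical_ok = is_legitimate(W, profiles, typical)
atypical_ok = is_legitimate(W, profiles, atypical)
```

In this toy setting, both inputs are predicted as class 0, but only the typical one responds to the class-0 perturbations as intended; the atypical one stays in class 0 after every perturbation and is rejected. Whether real adversarial or out-distribution inputs behave this way depends on the actual CNN and attack, which this sketch does not model.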

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Electrostatic Discharge in Electronics