Detecting Adversarial Examples from Sensitivity Inconsistency of Spatial-Transform Domain

Jinyu Tian; Jiantao Zhou; Yuanman Li; Jia Duan

arXiv:2103.04302 · cs.LG · March 9, 2021 · 6 cites

TL;DR

This paper introduces a novel adversarial example detection method based on the sensitivity inconsistency between a primal classifier and a dual classifier, improving detection accuracy especially for small perturbations.

Contribution

The paper proposes the Sensitivity Inconsistency Detector (SID), leveraging the sensitivity difference between classifiers with transformed decision boundaries to detect adversarial examples.

Findings

01 SID outperforms the state-of-the-art detectors based on Local Intrinsic Dimensionality (LID), Mahalanobis Distance (MD), and Feature Squeezing (FS).

02 SID generalizes better than these baselines when detecting adversarial examples with small perturbations.

03 Experiments on ResNet and VGG demonstrate SID's effectiveness.

Abstract

Deep neural networks (DNNs) have been shown to be vulnerable against adversarial examples (AEs), which are maliciously designed to cause dramatic model output errors. In this work, we reveal that normal examples (NEs) are insensitive to the fluctuations occurring at the highly-curved region of the decision boundary, while AEs typically designed over one single domain (mostly spatial domain) exhibit exorbitant sensitivity on such fluctuations. This phenomenon motivates us to design another classifier (called dual classifier) with transformed decision boundary, which can be collaboratively used with the original classifier (called primal classifier) to detect AEs, by virtue of the sensitivity inconsistency. When comparing with the state-of-the-art algorithms based on Local Intrinsic Dimensionality (LID), Mahalanobis Distance (MD), and Feature Squeezing (FS), our proposed Sensitivity…
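
The primal/dual idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The two toy classifiers, the L1 score between their softmax outputs, the threshold value, and all function names are assumptions made for illustration; the paper's actual dual classifier and detector may differ. The sketch only shows the intuition that an input far from the decision boundary gets nearly the same prediction from both classifiers, while an input sitting close to the boundary reacts strongly when that boundary is altered.

```python
import math
from typing import Callable, List, Sequence

# Hypothetical type alias: a classifier maps an input vector to logits.
Classifier = Callable[[Sequence[float]], List[float]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def inconsistency_score(x: Sequence[float],
                        primal: Classifier,
                        dual: Classifier) -> float:
    """L1 distance between primal and dual class probabilities.

    Assumption: this simple distance stands in for the paper's
    sensitivity-inconsistency measure, which is not specified here.
    """
    p = softmax(primal(x))
    q = softmax(dual(x))
    return sum(abs(a - b) for a, b in zip(p, q))


def is_adversarial(x: Sequence[float],
                   primal: Classifier,
                   dual: Classifier,
                   threshold: float = 0.5) -> bool:
    """Flag x when the two classifiers disagree more than `threshold`.

    The threshold is an illustrative value, not one from the paper.
    """
    return inconsistency_score(x, primal, dual) > threshold


# Toy two-class classifiers on a 1-D input (both are assumptions).
def toy_primal(x: Sequence[float]) -> List[float]:
    # Decision boundary at x[0] = 0.
    return [4.0 * x[0], -4.0 * x[0]]


def toy_dual(x: Sequence[float]) -> List[float]:
    # Same classifier with a shifted decision boundary at x[0] = 0.2.
    return [4.0 * (x[0] - 0.2), -4.0 * (x[0] - 0.2)]


if __name__ == "__main__":
    far_from_boundary = [2.0]   # behaves like a normal example
    near_boundary = [0.1]       # behaves like a small-perturbation AE
    for name, x in [("far", far_from_boundary), ("near", near_boundary)]:
        score = inconsistency_score(x, toy_primal, toy_dual)
        print(name, round(score, 4), is_adversarial(x, toy_primal, toy_dual))
```

In this toy setting the point far from the boundary receives a score close to zero, while the point between the two boundaries receives a score above the threshold and is flagged. A real detector would replace the toy classifiers with trained networks and would likely learn the decision rule from data rather than use a fixed threshold.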

Code & Models

Repositories

TooTouch/SID
pytorch

Videos

Detecting Adversarial Examples from Sensitivity Inconsistency of Spatial-Transform Domain

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Integrated Circuits and Semiconductor Failure Analysis

Methods: Residual Connection · Max Pooling · Average Pooling · Residual Block · Kaiming Initialization · Global Average Pooling · Dense Connections · Softmax · Autoencoders · Batch Normalization