Revisiting Model's Uncertainty and Confidences for Adversarial Example Detection

Ahmed Aldahdooh, Wassim Hamidouche, and Olivier Déforges

arXiv:2103.05354·cs.CR·June 22, 2021

TL;DR

This paper introduces SFAD, an unsupervised ensemble method that improves adversarial example detection by leveraging model uncertainty and feature map analysis, outperforming existing techniques especially against black- and gray-box attacks.

Contribution

The paper proposes SFAD, a novel unsupervised ensemble detection mechanism using SelectiveNet and feature maps, achieving superior robustness against various adversarial attacks.

Findings

01 Outperforms state-of-the-art detection methods against black- and gray-box attacks.

02 Achieves comparable performance to top methods against white-box attacks.

03 Fully robust against High Confidence Attacks on MNIST, partially robust on CIFAR10.

Abstract

Security-sensitive applications that rely on Deep Neural Networks (DNNs) are vulnerable to small perturbations that are crafted to generate Adversarial Examples (AEs). The AEs are imperceptible to humans and cause the DNN to misclassify them. Many defense and detection techniques have been proposed. The model's confidences and Dropout, a popular way to estimate the model's uncertainty, have been used for AE detection, but they showed limited success against black- and gray-box attacks. Moreover, the state-of-the-art detection techniques have been designed for specific attacks or broken by others, need knowledge about the attacks, are not consistent, increase model parameter overhead, are time-consuming, or add latency at inference time. To trade off these factors, we revisit the model's uncertainty and confidences and propose a novel unsupervised ensemble AE detection mechanism that 1) uses…
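For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of confidence- and Dropout-based detection. It is not the SFAD method. It only illustrates the generic idea of keeping dropout active at inference, averaging several stochastic forward passes, and flagging inputs whose averaged confidence is low or whose predictive entropy is high. The two-layer network with random weights, the function names, and the threshold values are all assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mc_dropout_predict(x, w1, w2, rng, passes=50, drop_rate=0.5):
    """Average softmax outputs over stochastic passes with dropout kept on."""
    probs = []
    for _ in range(passes):
        h = np.maximum(x @ w1, 0.0)               # ReLU hidden layer
        mask = rng.random(h.shape) >= drop_rate   # fresh dropout mask per pass
        h = h * mask / (1.0 - drop_rate)          # inverted-dropout scaling
        probs.append(softmax(h @ w2))
    return np.mean(probs, axis=0)

def uncertainty_scores(mean_probs):
    """Return (max averaged confidence, predictive entropy) per input."""
    confidence = mean_probs.max(axis=-1)
    entropy = -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=-1)
    return confidence, entropy

def flag_suspicious(confidence, entropy, min_conf, max_entropy):
    """Flag inputs that are low-confidence or high-uncertainty."""
    return (confidence < min_conf) | (entropy > max_entropy)

# Toy model with random weights (hypothetical, stands in for a trained DNN).
rng = np.random.default_rng(0)
w1 = rng.normal(size=(8, 16))
w2 = rng.normal(size=(16, 3))
x = rng.normal(size=(4, 8))                       # four toy inputs

mean_probs = mc_dropout_predict(x, w1, w2, rng)
confidence, entropy = uncertainty_scores(mean_probs)
# Thresholds are illustrative; in practice they would be tuned on clean data.
flags = flag_suspicious(confidence, entropy, min_conf=0.9, max_entropy=0.5)
```

In a real setting the thresholds would typically be chosen from clean validation data for a target false-positive rate. The abstract notes that this kind of detector has shown limited success against black- and gray-box attacks, which is the gap the paper sets out to address.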

Code & Models

Repositories

aldahdooh/SFAD (Official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications

Methods: Autoencoders · Dropout