Generalizable Adversarial Examples Detection Based on Bi-model Decision Mismatch

João Monteiro; Isabela Albuquerque; Zahid Akhtar; Tiago H. Falk

arXiv:1802.07770·cs.CV·April 24, 2019

TL;DR

This paper proposes a model-agnostic adversarial example detection method using bi-model decision mismatch, demonstrating high detection rates across various attack types without prior attack knowledge.

Contribution

It introduces a novel detection framework based on decision layer features from independent models, effective against multiple attack methods without needing attack-specific training.

Findings

1. Achieves over 90% detection rate in white-box attacks.

2. Generalizes well to unseen attack types.

3. Works with unmodified off-the-shelf models.

Abstract

Modern applications of artificial neural networks have yielded remarkable performance gains in a wide range of tasks. However, recent studies have discovered that this modelling strategy is vulnerable to adversarial examples, i.e., examples with subtle perturbations, often too small to be perceptible to humans, that can nonetheless easily fool neural networks. Defense techniques against adversarial examples have been proposed, but ensuring robust performance against varying or novel types of attacks remains an open problem. In this work, we focus on the detection setting, in which case attackers become identifiable while models remain vulnerable. In particular, we employ the decision layer of independently trained models as features for posterior detection. The proposed framework does not require any prior knowledge of adversarial example generation techniques, and can be directly employed along…
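
The idea of using the decision layers of two independently trained models as detection features can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy logits, and the simple argmax-disagreement flag are all assumptions made for illustration, and the detector that would consume the features is not shown.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def bimodel_features(logits_a, logits_b):
    """Concatenate the decision-layer outputs of two models (hypothetical helper).

    Each row is the feature vector that a separate detector could be trained on.
    """
    return np.concatenate([softmax(logits_a), softmax(logits_b)], axis=1)


def decisions_mismatch(logits_a, logits_b):
    """Flag inputs on which the two models predict different classes.

    A crude stand-in for a trained detector, used here only as an illustration.
    """
    return logits_a.argmax(axis=1) != logits_b.argmax(axis=1)


# Toy logits for two inputs and three classes (made-up values).
logits_a = np.array([[4.0, 0.5, 0.1], [0.2, 3.0, 0.1]])
logits_b = np.array([[3.5, 0.2, 0.3], [2.5, 0.4, 0.1]])

features = bimodel_features(logits_a, logits_b)    # shape (2, 6)
mismatch = decisions_mismatch(logits_a, logits_b)  # models agree on input 0, disagree on input 1
```

In this sketch the two models agree on the first input and disagree on the second, so only the second is flagged. In practice the concatenated outputs would more plausibly be passed to a learned binary classifier than to a hard disagreement rule.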
