Attack to Fool and Explain Deep Networks

Naveed Akhtar; Muhammad A. A. K. Jalwana; Mohammed Bennamoun; Ajmal Mian

arXiv:2106.10606·cs.CV·June 22, 2021

Open Access

TL;DR

This paper introduces a novel adversarial attack that not only fools deep visual models but also reveals human-meaningful patterns in perturbations, enabling interpretation and manipulation of model understanding of semantic concepts.

Contribution

The paper presents a new attack method that uncovers geometric patterns in adversarial perturbations and uses it as an interpretability tool for deep visual representations.

Findings

01. Adversarial perturbations contain human-meaningful geometric patterns.

02. The attack reveals insights into deep model decision boundaries.

03. Perturbations can be used for image generation, inpainting, and manipulation.

Abstract

Deep visual models are susceptible to adversarial perturbations to inputs. Although these signals are carefully crafted, they still appear as noise-like patterns to humans. This observation has led to the argument that deep visual representation is misaligned with human perception. We counter-argue by providing evidence of human-meaningful patterns in adversarial perturbations. We first propose an attack that fools a network into confusing a whole category of objects (source class) with a target label. Our attack also limits the unintended fooling by samples from non-source classes, thereby circumscribing human-defined semantic notions for network fooling. We show that the proposed attack not only leads to the emergence of regular geometric patterns in the perturbations, but also reveals insightful information about the decision boundaries of deep models. Exploring this phenomenon further, we…
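The idea of a single perturbation that sends a whole source class to a target label while leaving non-source samples alone can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it assumes a hypothetical three-class linear classifier on 2-D points, a simple sum of two cross-entropy terms (one pulling source samples toward the target, one preserving non-source labels), and projected gradient descent under an assumed infinity-norm budget `eps`. All names, weights, and data are made up for illustration.

```python
import math

# Hypothetical toy linear classifier: logits = W x (3 classes, 2-D inputs).
W = [[-2.0, 0.0], [2.0, 0.0], [0.0, 2.0]]

def logits(x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(x):
    z = logits(x)
    return z.index(max(z))

def add(x, rho):
    return [a + b for a, b in zip(x, rho)]

def ce_grad_wrt_input(x, label):
    """Gradient of cross-entropy(softmax(W x), label) with respect to x."""
    p = softmax(logits(x))
    p[label] -= 1.0  # softmax output minus one-hot label
    return [sum(W[k][j] * p[k] for k in range(len(W))) for j in range(len(x))]

def class_targeted_perturbation(source, others, target, eps,
                                lam=1.0, lr=0.1, steps=200):
    """One shared perturbation rho (assumed objective, not the paper's):
    push every source sample toward `target`, keep labelled non-source
    samples on their own label, and clip rho to the box |rho_i| <= eps."""
    rho = [0.0] * len(source[0])
    for _ in range(steps):
        grad = [0.0] * len(rho)
        for x in source:  # fooling term, averaged over the source class
            g = ce_grad_wrt_input(add(x, rho), target)
            grad = [a + b / len(source) for a, b in zip(grad, g)]
        for x, y in others:  # preservation term for non-source samples
            g = ce_grad_wrt_input(add(x, rho), y)
            grad = [a + lam * b / len(others) for a, b in zip(grad, g)]
        # Gradient step followed by projection onto the infinity-norm ball.
        rho = [max(-eps, min(eps, r - lr * g)) for r, g in zip(rho, grad)]
    return rho

# Made-up data: source samples are classified as class 0, the others as class 2.
source = [[-1.0, -3.0], [-1.2, -2.8], [-0.8, -3.2]]
others = [([0.0, 5.0], 2), ([0.1, 5.2], 2), ([-0.1, 4.8], 2)]
eps = 1.5

rho = class_targeted_perturbation(source, others, target=1, eps=eps)
fooled = sum(predict(add(x, rho)) == 1 for x in source) / len(source)
kept = sum(predict(add(x, rho)) == y for x, y in others) / len(others)
print("perturbation:", rho, "fooling rate:", fooled, "preserved:", kept)
```

Averaging the gradient over all source samples is what makes the perturbation class-wide rather than image-specific; the preservation term and its weight `lam` are one assumed way to discourage fooling on non-source classes, and a deep network would replace the closed-form input gradient with backpropagation.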

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Bacillus and Francisella bacterial research · Anomaly Detection Techniques and Applications

Methods: Inpainting