Attack to Fool and Explain Deep Networks
Naveed Akhtar, Muhammad A. A. K. Jalwana, Mohammed Bennamoun, Ajmal Mian

TL;DR
This paper introduces a novel adversarial attack that not only fools deep visual models but also reveals human-meaningful patterns in perturbations, enabling interpretation and manipulation of model understanding of semantic concepts.
Contribution
The paper presents a new attack method that uncovers geometric patterns in adversarial perturbations, and it uses the attack as an interpretability tool for deep visual representations.
Findings
Adversarial perturbations contain human-meaningful geometric patterns.
The attack reveals insights into deep model decision boundaries.
Perturbations can be used for image generation, inpainting, and manipulation.
Abstract
Deep visual models are susceptible to adversarial perturbations to inputs. Although these signals are carefully crafted, they still appear as noise-like patterns to humans. This observation has led to the argument that deep visual representation is misaligned with human perception. We counter-argue by providing evidence of human-meaningful patterns in adversarial perturbations. We first propose an attack that fools a network into confusing a whole category of objects (source class) with a target label. Our attack also limits the unintended fooling by samples from non-source classes, thereby circumscribing human-defined semantic notions for network fooling. We show that the proposed attack not only leads to the emergence of regular geometric patterns in the perturbations, but also reveals insightful information about the decision boundaries of deep models. Exploring this phenomenon further, we…
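As a rough illustration of the objective the abstract describes (push every source-class sample toward a target label while keeping non-source samples correctly classified), the sketch below sums two cross-entropy terms over a single shared perturbation. The two-term form, the toy linear classifier standing in for a deep network, and all function and variable names are assumptions made for illustration; the paper's actual formulation is not given in this excerpt.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy of one logit vector against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def linear_logits(weights, x):
    """Toy stand-in for a deep network: one weight row per class."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in weights]

def class_targeted_loss(weights, perturbation, source_samples, target_label,
                        non_source_samples, non_source_labels):
    """Hypothetical two-term objective (not taken from the paper).

    fool:     mean loss of perturbed source samples against the target label
    preserve: mean loss of perturbed non-source samples against their own labels
    A lower value means the perturbation fools the source class while
    disturbing the other classes less.
    """
    def perturbed(x):
        return [x_i + p_i for x_i, p_i in zip(x, perturbation)]

    fool = sum(
        softmax_cross_entropy(linear_logits(weights, perturbed(x)), target_label)
        for x in source_samples
    ) / len(source_samples)
    preserve = sum(
        softmax_cross_entropy(linear_logits(weights, perturbed(x)), y)
        for x, y in zip(non_source_samples, non_source_labels)
    ) / len(non_source_samples)
    return fool + preserve

# Toy setup: 3 classes, 2 input features (all values are made up).
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
source_samples = [[2.0, 0.0]]        # classified as class 0 when unperturbed
non_source_samples = [[-2.0, -2.0]]  # classified as class 2 when unperturbed
non_source_labels = [2]
target_label = 1

zero_loss = class_targeted_loss(weights, [0.0, 0.0], source_samples,
                                target_label, non_source_samples,
                                non_source_labels)
attack_loss = class_targeted_loss(weights, [0.0, 3.0], source_samples,
                                  target_label, non_source_samples,
                                  non_source_labels)
print(attack_loss < zero_loss)
```

In this toy setup, the perturbation `[0.0, 3.0]` moves the source sample into the target class and lowers the combined loss relative to no perturbation. An actual attack would minimize such a loss over the perturbation, typically with gradient steps under a norm constraint, which this sketch omits.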
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Bacillus and Francisella bacterial research · Anomaly Detection Techniques and Applications
Methods: Inpainting
