Saliency Methods for Explaining Adversarial Attacks

Jindong Gu; Volker Tresp

arXiv:1908.08413·cs.CV·October 22, 2019·21 cites

Open Access

TL;DR

This paper investigates how well saliency methods explain adversarial attacks on neural networks and proposes an enhancement to Guided Backpropagation (GuidedBP) that achieves state-of-the-art performance in explaining adversarial classifications.

Contribution

The paper introduces a simple and efficient enhancement to Guided Backpropagation (GuidedBP) that improves its ability to explain adversarial misclassifications by neural networks.

Findings

01 Enhanced GuidedBP outperforms previous saliency methods in explaining adversarial attacks.

02 Saliency maps created with the proposed method contain more class-discriminative information.

03 The approach is computationally efficient and improves the interpretability of adversarial examples.

Abstract

The classification decisions of neural networks can be misled by small, imperceptible perturbations. This work aims to explain the misled classifications using saliency methods. The idea behind saliency methods is to explain the classification decisions of neural networks by creating so-called saliency maps. Unfortunately, a number of recent publications have shown that many of the proposed saliency methods do not provide insightful explanations. A prominent example is Guided Backpropagation (GuidedBP), which has been shown to simply perform (partial) image recovery. However, our numerical analysis shows that the saliency maps created by GuidedBP do indeed contain class-discriminative information. We propose a simple and efficient way to enhance the saliency maps. The proposed enhanced GuidedBP achieves state-of-the-art performance in explaining adversarial classifications.
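For readers unfamiliar with GuidedBP, the following is a minimal sketch of the standard Guided Backpropagation rule on a tiny ReLU network, written in plain Python. It is not the enhanced method proposed in the paper, whose details are not given on this page. The two-layer network, its weights, the input, and the function names (`matvec`, `matvec_t`, `saliency`) are hypothetical and chosen only for illustration. The sketch shows the one place where GuidedBP differs from an ordinary gradient: at each ReLU, a backward signal is passed only if the forward pre-activation was positive and the incoming signal is itself positive.

```python
def matvec(W, x):
    """Compute W @ x for a list-of-rows matrix W and a vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def matvec_t(W, g):
    """Compute W^T @ g, i.e. propagate a signal g back through W."""
    return [sum(W[i][j] * g[i] for i in range(len(W)))
            for j in range(len(W[0]))]


def saliency(W1, W2, x, target, guided=True):
    """Saliency of logit `target` with respect to input x.

    The network is logits = W2 @ relu(W1 @ x) (an illustrative assumption).
    guided=False gives the plain gradient; guided=True applies the
    Guided Backpropagation rule at the ReLU layer.
    """
    z1 = matvec(W1, x)                      # hidden pre-activations
    h1 = [max(0.0, z) for z in z1]          # ReLU
    logits = matvec(W2, h1)

    # Start the backward pass from a one-hot signal on the target logit.
    g_logits = [1.0 if k == target else 0.0 for k in range(len(logits))]
    g_h1 = matvec_t(W2, g_logits)

    if guided:
        # GuidedBP: keep the signal only where the unit was active in the
        # forward pass AND the backward signal is positive.
        g_z1 = [g if (z > 0 and g > 0) else 0.0 for g, z in zip(g_h1, z1)]
    else:
        # Plain gradient: keep the signal wherever the unit was active.
        g_z1 = [g if z > 0 else 0.0 for g, z in zip(g_h1, z1)]

    return matvec_t(W1, g_z1)


# Hypothetical toy weights and input (2 inputs, 3 hidden units, 2 classes).
W1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 1.0]]
W2 = [[1.0, -2.0, 3.0], [-1.0, 1.0, 0.5]]
x = [1.0, 0.5]

plain_map = saliency(W1, W2, x, target=0, guided=False)
guided_map = saliency(W1, W2, x, target=0, guided=True)
print("plain gradient:", plain_map)
print("guided backprop:", guided_map)
```

In this toy case the second hidden unit is active but receives a negative backward signal, so the plain gradient propagates it while GuidedBP discards it. That filtering of negative signals is the property behind the criticism quoted in the abstract, that GuidedBP maps resemble a partial recovery of the input.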

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Anomaly Detection Techniques and Applications