Generalizing Adversarial Explanations with Grad-CAM
Tanmay Chakraborty, Utkarsh Trehan, Khawla Mallat, and Jean-Luc Dugelay

TL;DR
This paper extends Grad-CAM from example-based explanations to explanations of global model behavior by introducing two new metrics, enabling a better understanding of how CNNs behave under adversarial attacks.
Contribution
Introduces two novel metrics, MOD and VID, to generalize Grad-CAM for explaining overall CNN behavior and robustness against adversarial attacks.
Findings
Grad-CAM heatmaps shift under adversarial attacks across multiple CNN models.
The proposed metrics effectively capture changes in model decision regions.
The method aids in understanding and explaining black-box CNN models in adversarial contexts.
Abstract
Gradient-weighted Class Activation Mapping (Grad-CAM) is an example-based explanation method that provides a gradient activation heat map as an explanation for Convolutional Neural Network (CNN) models. The drawback of this method is that it cannot be used to generalize CNN behaviour. In this paper, we present a novel method that extends Grad-CAM from example-based explanations to a method for explaining global model behaviour. This is achieved by introducing two new metrics, (i) Mean Observed Dissimilarity (MOD) and (ii) Variation in Dissimilarity (VID), for model generalization. These metrics are computed by comparing a Normalized Inverted Structural Similarity Index (NISSIM) metric of the Grad-CAM generated heatmap for samples from the original test set and samples from the adversarial test set. For our experiment, we study adversarial attacks on deep models such as VGG16, ResNet50,…
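The abstract names the metrics but does not give their formulas, so the following is only a minimal sketch of how such a pipeline could look. It assumes that NISSIM is (1 - SSIM) / 2, that MOD is the mean and VID the standard deviation of the per-sample NISSIM scores, and that SSIM is computed over the whole heatmap in a single window. The function names `global_ssim`, `nissim`, and `mod_vid` are hypothetical, and heatmaps are assumed to be precomputed Grad-CAM arrays scaled to [0, 1].

```python
import numpy as np


def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM over two whole heatmaps (simplification: no sliding window)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    # Standard SSIM stabilising constants.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    dx, dy = x - mu_x, y - mu_y
    var_x, var_y = (dx * dx).mean(), (dy * dy).mean()
    cov = (dx * dy).mean()
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return num / den


def nissim(x, y):
    """ASSUMPTION: NISSIM = (1 - SSIM) / 2, so 0 means identical heatmaps."""
    return (1.0 - global_ssim(x, y)) / 2.0


def mod_vid(clean_maps, adv_maps):
    """ASSUMPTION: MOD is the mean and VID the standard deviation of NISSIM scores.

    clean_maps[i] and adv_maps[i] are heatmaps for the same test sample,
    before and after the adversarial perturbation.
    """
    scores = np.array([nissim(c, a) for c, a in zip(clean_maps, adv_maps)])
    return scores.mean(), scores.std()
```

Under these assumptions, a MOD near 0 would indicate that the attack barely moves the regions the model attends to, while a larger MOD would indicate a larger average shift. VID would indicate how consistent that shift is across samples.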
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education
Methods: Heatmap · Convolution
