Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

Ramprasaath R. Selvaraju; Michael Cogswell; Abhishek Das; Ramakrishna Vedantam; Devi Parikh; Dhruv Batra

arXiv:1610.02391·cs.CV·December 4, 2019

TL;DR

Grad-CAM is a versatile technique that produces visual explanations for CNN decisions by highlighting important image regions, applicable across various models and tasks, improving interpretability, robustness, and trust.

Contribution

We introduce Grad-CAM, a general method for visual explanations in CNNs that requires no architectural changes and enhances interpretability across multiple applications.

Findings

01 Grad-CAM provides accurate localization of important regions in images.

02 Its visualizations lend insight into model failure modes and remain robust to adversarial images.

03 Grad-CAM aids in establishing user trust and identifying dataset biases.

Abstract

We propose a technique for producing "visual explanations" for decisions from a large class of CNN-based models, making them more transparent. Our approach, Gradient-weighted Class Activation Mapping (Grad-CAM), uses the gradients of any target concept flowing into the final convolutional layer to produce a coarse localization map highlighting the regions of the image that are important for predicting the concept. Grad-CAM is applicable to a wide variety of CNN model families without any architectural changes or re-training: (1) CNNs with fully-connected layers, (2) CNNs used for structured outputs, and (3) CNNs used in tasks with multimodal inputs or reinforcement learning. We combine Grad-CAM with fine-grained visualizations to create a high-resolution class-discriminative visualization and apply it to off-the-shelf image classification, captioning, and visual question answering (VQA) models,…
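The core computation described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' code: it assumes the feature maps of the final convolutional layer and the gradients of the target score with respect to them have already been extracted from a model (how to do so depends on the framework). The function name `grad_cam`, the toy arrays, and the final rescaling to [0, 1] are assumptions made for this example.

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Coarse localization map from one convolutional layer.

    activations: feature maps of the layer, shape (K, H, W)
    gradients:   d(target score) / d(activations), same shape
    """
    if activations.shape != gradients.shape:
        raise ValueError("activations and gradients must share a shape")
    # One importance weight per channel: the spatially averaged gradient.
    weights = gradients.mean(axis=(1, 2))              # shape (K,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)   # shape (H, W)
    # ReLU keeps only regions with a positive influence on the score.
    cam = np.maximum(cam, 0.0)
    # Rescale to [0, 1] for display (an assumption of this sketch);
    # an all-zero map is returned unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy input (hypothetical values): two 2x2 feature maps.
acts = np.array([[[1.0, 0.0], [0.0, 0.0]],
                 [[0.0, 0.0], [0.0, 2.0]]])
grads = np.array([[[0.5, 0.5], [0.5, 0.5]],
                  [[-1.0, -1.0], [-1.0, -1.0]]])
heatmap = grad_cam(acts, grads)
```

In this toy case the second channel has a negative averaged gradient, so the location it activates is suppressed by the ReLU and only the top-left cell remains highlighted. In practice the resulting map is much smaller than the input image and is typically upsampled to image size before being overlaid as a heatmap.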
