Grad-CAM: Why did you say that?

Ramprasaath R Selvaraju; Abhishek Das; Ramakrishna Vedantam; Michael Cogswell; Devi Parikh; Dhruv Batra

arXiv:1611.07450 · stat.ML · January 26, 2017 · 326 cites

TL;DR

Grad-CAM is a visualization technique for CNN-based models that highlights the input regions most important for a prediction, making models such as image captioning and VQA systems more transparent and easier to trust.

Contribution

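The paper presents Grad-CAM, which uses class-specific gradient information to localize the image regions important for a prediction, and Guided Grad-CAM, which combines these localizations with existing pixel-space visualizations to produce high-resolution, class-discriminative visual explanations.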
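The gradient-weighted localization step can be illustrated with a minimal NumPy sketch. It assumes you have already extracted, from some framework, the feature maps of a convolutional layer and the gradients of the target class score with respect to them; the function name `grad_cam` and the array layout `(K, H, W)` are choices made for this illustration, not the authors' released code.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Coarse class-specific heatmap from one conv layer.

    activations: feature maps, shape (K, H, W)  (assumed layout)
    gradients:   d(class score)/d(activations), same shape
    """
    # One importance weight per channel: global-average-pool the gradients.
    weights = gradients.mean(axis=(1, 2))               # shape (K,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)    # shape (H, W)
    # ReLU keeps only regions with a positive influence on the class score.
    return np.maximum(cam, 0.0)

# Toy inputs standing in for real network outputs (random, illustration only).
rng = np.random.default_rng(0)
A = rng.random((4, 3, 3))
dY = rng.standard_normal((4, 3, 3))
heatmap = grad_cam(A, dY)
```

The resulting map has the spatial size of the chosen layer, so it is coarse; in practice it is upsampled to the input resolution before being overlaid on the image.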

Findings

01 Grad-CAM effectively localizes important regions for CNN predictions.

02 Guided Grad-CAM produces high-resolution visualizations.

03 Visual explanations correlate well with occlusion maps and improve interpretability.
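As a rough illustration of how a coarse Grad-CAM map and a pixel-space visualization can be fused into a high-resolution result, the sketch below upsamples the heatmap and multiplies it pointwise with a guided-backpropagation map. The helper names, the `(H, W, C)` layout, and the nearest-neighbour upsampling are simplifying assumptions for this example; a smoother interpolation such as bilinear would normally be used.

```python
import numpy as np

def upsample_nearest(cam, factor):
    """Nearest-neighbour upsampling of a 2-D map by an integer factor
    (a simplification chosen for this sketch)."""
    return np.repeat(np.repeat(cam, factor, axis=0), factor, axis=1)

def guided_grad_cam(cam, guided_backprop, factor):
    """Fuse a coarse heatmap with a pixel-space visualization.

    cam:             coarse Grad-CAM map, shape (h, w)
    guided_backprop: pixel-space map, shape (h*factor, w*factor, C)
    """
    cam_up = upsample_nearest(cam, factor)       # (h*factor, w*factor)
    # Pointwise product: fine detail survives only where the coarse,
    # class-specific map is non-zero.
    return guided_backprop * cam_up[..., None]

# Toy inputs (illustration only, not real network outputs).
cam = np.array([[0.0, 1.0],
                [2.0, 0.0]])
gb = np.ones((4, 4, 3))
fused = guided_grad_cam(cam, gb, factor=2)
```

The product inherits class-discriminativeness from the heatmap and pixel-level detail from the guided-backpropagation map.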

Abstract

We propose a technique for making Convolutional Neural Network (CNN)-based models more transparent by visualizing input regions that are 'important' for predictions -- or visual explanations. Our approach, called Gradient-weighted Class Activation Mapping (Grad-CAM), uses class-specific gradient information to localize important regions. These localizations are combined with existing pixel-space visualizations to create a novel high-resolution and class-discriminative visualization called Guided Grad-CAM. These methods help better understand CNN-based models, including image captioning and visual question answering (VQA) models. We evaluate our visual explanations by measuring their ability to discriminate between classes, to inspire trust in humans, and their correlation with occlusion maps. Grad-CAM provides a new way to understand CNN-based models. We have released code, an online…
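To make the occlusion-map comparison concrete, here is a minimal sketch of one way such a map could be built and compared against another heatmap. It assumes a hypothetical `score_fn` that returns the class score for an image, uses non-overlapping patches, and measures agreement with a simple rank correlation; the exact protocol and metric in the paper may differ.

```python
import numpy as np

def occlusion_map(image, score_fn, patch=2, fill=0.0):
    """Score drop caused by masking each non-overlapping patch.

    score_fn is a hypothetical callable: image -> class score.
    """
    base = score_fn(image)
    height, width = image.shape[:2]
    heat = np.zeros((height, width))
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            occluded = image.copy()
            occluded[top:top + patch, left:left + patch] = fill
            # A large drop means the masked region mattered for the score.
            heat[top:top + patch, left:left + patch] = base - score_fn(occluded)
    return heat

def rank_correlation(a, b):
    """Spearman-style rank correlation of two maps.

    Ties are broken by position rather than averaged, a simplification
    for this sketch.
    """
    ranks_a = np.argsort(np.argsort(a.ravel()))
    ranks_b = np.argsort(np.argsort(b.ravel()))
    return np.corrcoef(ranks_a, ranks_b)[0, 1]

# Toy example: the "model" only looks at the top-left 2x2 corner.
image = np.ones((4, 4))
toy_score = lambda img: img[:2, :2].sum()
occ = occlusion_map(image, toy_score)
```

A higher rank correlation between an explanation heatmap and the occlusion map suggests that the explanation highlights the regions whose removal actually changes the model's score.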

Taxonomy

Topics: Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning