TL;DR
Grad-CAM produces visual explanations for CNN decisions by highlighting the image regions that matter most for a prediction. It applies to a wide range of CNN models and tasks without architectural changes, and its explanations support interpretability, robustness analysis, and user trust.
Contribution
We introduce Grad-CAM, a general method for visual explanations in CNNs that requires no architectural changes and enhances interpretability across multiple applications.
Findings
Grad-CAM provides accurate localization of important regions in images.
It sheds light on model failure modes, and its explanations remain robust to adversarial images.
Grad-CAM aids in establishing user trust and identifying dataset biases.
Abstract
We propose a technique for producing "visual explanations" for decisions from a large class of CNN-based models, making them more transparent. Our approach, Gradient-weighted Class Activation Mapping (Grad-CAM), uses the gradients of any target concept flowing into the final convolutional layer to produce a coarse localization map highlighting the regions of the image that are important for predicting the concept. Grad-CAM is applicable to a wide variety of CNN model families without any architectural changes or re-training: (1) CNNs with fully-connected layers, (2) CNNs used for structured outputs, and (3) CNNs used in tasks with multimodal inputs or reinforcement learning. We combine Grad-CAM with fine-grained visualizations to create a high-resolution class-discriminative visualization and apply it to off-the-shelf image classification, captioning, and visual question answering (VQA) models,…
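The core computation described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes the final convolutional layer's feature maps and the gradients of the target score with respect to them have already been extracted from the model (for example with framework hooks), and the names `grad_cam`, `activations`, and `gradients` are hypothetical. Following the formulation in the full paper, each channel is weighted by its spatially averaged gradient, the weighted maps are summed, and a ReLU keeps only regions with a positive influence on the target concept. The final rescaling to [0, 1] is a display convenience added here.

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Compute a coarse Grad-CAM map from precomputed tensors.

    activations: feature maps of the final conv layer, shape (K, H, W).
    gradients:   d(target score) / d(activations), same shape (K, H, W).
    Both inputs are assumed to come from a single forward/backward pass.
    """
    if activations.shape != gradients.shape:
        raise ValueError("activations and gradients must have the same shape")

    # Channel importance: global-average-pool the gradients over H and W.
    weights = gradients.mean(axis=(1, 2))  # shape (K,)

    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)  # shape (H, W)

    # ReLU: keep only regions that push the target score up.
    cam = np.maximum(cam, 0.0)

    # Rescale to [0, 1] for display (an assumption of this sketch).
    peak = cam.max()
    if peak > 0:
        cam = cam / peak
    return cam
```

The resulting map has the spatial resolution of the final convolutional layer, which is why the abstract calls it coarse; in practice it is typically upsampled to the input image size before being overlaid as a heatmap.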
Taxonomy
Methods
