TL;DR
This paper introduces CNN Fixations, a visualization method that reveals the key image regions influencing CNN predictions by unraveling the forward pass, providing transparency without altering the network or requiring additional training.
Contribution
The proposed approach offers a generic, architecture-agnostic way to visualize discriminative image regions in CNNs by analyzing feature dependencies during the forward pass.
Findings
Effectively localizes important image regions across various CNN architectures.
Applicable to multiple vision tasks, such as object recognition and caption generation.
Requires no modifications or additional training of the CNN models.
Abstract
Deep convolutional neural networks (CNNs) have revolutionized various fields of vision research and have seen unprecedented adoption for multiple tasks such as classification, detection, captioning, etc. However, they offer little transparency into their inner workings and are often treated as black boxes that deliver excellent performance. In this work, we aim at alleviating this opaqueness of CNNs by providing visual explanations for the network's predictions. Our approach can analyze a variety of CNN-based models trained for vision applications such as object recognition and caption generation. Unlike existing methods, we achieve this via unraveling the forward pass operation. The proposed method exploits feature dependencies across the layer hierarchy and uncovers the discriminative image locations that guide the network's predictions. We name these locations CNN-Fixations, loosely…
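The idea of unraveling the forward pass can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a tiny fully connected ReLU network with hand-picked weights, and it assumes a simple selection rule (keep every lower-layer unit whose weight times activation is positive). The abstract does not state the rule the paper uses, and convolutional layers are not handled here. All names (`forward`, `trace_back`, `W1`, `W2`) are hypothetical.

```python
def matvec(W, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def forward(x, weights):
    """Run the forward pass and keep the activations of every layer."""
    acts = [list(x)]
    for k, W in enumerate(weights):
        z = matvec(W, acts[-1])
        last = k == len(weights) - 1
        # ReLU on hidden layers, raw scores on the final layer.
        acts.append(z if last else [max(0.0, v) for v in z])
    return acts


def trace_back(acts, weights, neuron):
    """Follow positive contributions from one output neuron down to the input.

    Assumed rule (illustrative only): unit i in the lower layer supports
    unit j in the upper layer when weight[j][i] * activation[i] > 0.
    """
    active = {neuron}
    for k in range(len(weights) - 1, -1, -1):
        lower = acts[k]  # activations feeding layer k
        supporters = set()
        for j in active:
            for i, (w, a) in enumerate(zip(weights[k][j], lower)):
                if w * a > 0:
                    supporters.add(i)
        active = supporters
    return sorted(active)


# Toy network: 4 inputs -> 3 hidden units -> 2 class scores.
x = [1.0, 0.0, 2.0, 0.5]
W1 = [[1.0, 0.5, 0.0, 0.0],
      [0.0, 0.0, -1.0, 0.0],
      [0.0, 0.0, 0.0, 2.0]]
W2 = [[1.0, 1.0, -1.0],
      [0.5, 0.0, 2.0]]

acts = forward(x, [W1, W2])
scores = acts[-1]
pred = max(range(len(scores)), key=lambda j: scores[j])
fixations = trace_back(acts, [W1, W2], pred)
print(pred, fixations)  # prints: 1 [0, 3]
```

In this toy case, input 2 has the largest value but feeds only a hidden unit that ReLU zeroes out, so it is not traced. The traced set depends on which paths carried evidence for the predicted class during the forward pass, not on input magnitude alone, and no retraining or change to the network is needed to compute it.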
