White Box Methods for Explanations of Convolutional Neural Networks in Image Classification Tasks
Meghna P Ayyar, Jenny Benois-Pineau, Akka Zemmari

TL;DR
This paper reviews and classifies white box explanation methods for CNNs in image classification, focusing on how internal architecture information can be used to generate pixel importance maps.
Contribution
It provides a comprehensive taxonomy of white box explanation methods for CNNs, including their assumptions and implementations, to aid researchers in selecting suitable techniques.
Findings
Classified white box explanation methods based on assumptions and implementations.
Provided a detailed overview of methods for creating pixel importance maps.
Facilitated better comparison and selection of explanation techniques.
Abstract
In recent years, deep learning has become prevalent for solving problems in multiple domains. Convolutional Neural Networks (CNNs) in particular have demonstrated state-of-the-art performance on image classification. However, the decisions made by these networks are not transparent and cannot be directly interpreted by a human. Several approaches have been proposed to explain and understand the reasoning behind a prediction made by a network. In this paper, we propose a taxonomy for grouping these methods based on their assumptions and implementations. We focus primarily on white box methods, which leverage information about the internal architecture of a network to explain its decisions. Given the task of image classification and a trained CNN, this work aims to provide a comprehensive and detailed overview of a set of methods that can be used to create explanation maps for…
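To make the idea of a white box explanation map concrete, here is a minimal sketch of vanilla gradient saliency, one common way to turn a network's internals into a pixel importance map. It is an illustration only and is not taken from the paper: the toy network (one 2x2 filter, ReLU, global sum pooling, a single class weight), its weights, the input image, and all function names are assumptions chosen to keep the example in plain Python.

```python
# Minimal sketch of a white box explanation: vanilla gradient saliency.
# The toy "CNN" and every weight below are hypothetical, for illustration only.

def conv2d_valid(image, kernel):
    """Valid cross-correlation of a 2D image with a 2D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def class_score(image, kernel, class_weight):
    """Score = class_weight * sum(ReLU(conv(image, kernel)))."""
    fmap = conv2d_valid(image, kernel)
    return class_weight * sum(max(0.0, v) for row in fmap for v in row)


def gradient_saliency(image, kernel, class_weight):
    """Absolute gradient of the class score with respect to each pixel.

    This is "white box" because the backward pass reads the network's own
    weights and the ReLU activation pattern from the forward pass.
    """
    kh, kw = len(kernel), len(kernel[0])
    fmap = conv2d_valid(image, kernel)
    grad = [[0.0] * len(image[0]) for _ in image]
    for i, row in enumerate(fmap):
        for j, v in enumerate(row):
            if v > 0:  # ReLU passes gradient only where it was active
                for a in range(kh):
                    for b in range(kw):
                        grad[i + a][j + b] += class_weight * kernel[a][b]
    return [[abs(g) for g in row] for row in grad]


# Hypothetical 3x3 grayscale input and a diagonal-detecting filter.
IMAGE = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]
KERNEL = [[1.0, -1.0], [-1.0, 1.0]]

saliency = gradient_saliency(IMAGE, KERNEL, class_weight=2.0)
for row in saliency:
    print(row)
```

In this toy case the centre pixel receives the largest importance because it sits under both active filter positions. A black box method would have to estimate the same map by perturbing the input and observing the output, without access to the weights or activations.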
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
