How Can One Choose the Best CAM-Based Explainability Method for a CNN Model?
Daniel da Silva Costa, Pedro Nuno de Souza Moura, Adriana C. F. Alvim

TL;DR
This paper investigates how to select the most human-like CAM-based explainability method for CNNs by comparing saliency maps with human perception using various metrics.
Contribution
It proposes a new approach to evaluate explainability methods based on similarity metrics aligned with human perception, validated through experiments with ImageNet data.
Findings
The Manhattan distance and correlation metrics best match human perception.
LayerCAM, Score-CAM, and IS-CAM are the top explainability methods.
The proposed method effectively identifies the most human-like saliency maps.
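To make the comparison concrete, here is a minimal sketch of how a saliency map could be scored against a human attention map with the three metrics named on this page: IoU, Manhattan distance, and correlation. It is an illustration under stated assumptions, not the authors' implementation. The function names, the min-max normalisation, the 0.5 binarisation threshold for IoU, and the use of Pearson correlation are all assumptions made for the example; the summary does not specify them.

```python
import numpy as np


def min_max_normalize(m):
    """Rescale a map to [0, 1]; a constant map becomes all zeros (assumed convention)."""
    m = np.asarray(m, dtype=float)
    span = m.max() - m.min()
    if span == 0:
        return np.zeros_like(m)
    return (m - m.min()) / span


def manhattan_distance(a, b):
    """Sum of absolute pixel-wise differences (L1 distance); lower means more similar."""
    return float(np.abs(a - b).sum())


def pearson_correlation(a, b):
    """Pearson correlation of the flattened maps; higher means more similar.

    Returns NaN if either map is constant, since its variance is zero.
    """
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])


def iou(a, b, threshold=0.5):
    """IoU of the two maps after binarising at an assumed threshold."""
    mask_a = a >= threshold
    mask_b = b >= threshold
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(mask_a, mask_b).sum() / union)


# Toy 2x2 maps standing in for a human attention map and a CAM output.
human = min_max_normalize([[0.0, 0.0], [0.0, 1.0]])
saliency = min_max_normalize([[0.0, 0.0], [0.5, 1.0]])

scores = {
    "manhattan": manhattan_distance(human, saliency),
    "correlation": pearson_correlation(human, saliency),
    "iou": iou(human, saliency),
}
print(scores)
```

Note that IoU requires turning each continuous map into a binary mask first, so its value depends on the chosen threshold, whereas the Manhattan distance and correlation operate on the continuous values directly. This is one way to read the abstract's remark that saliency maps "do not have a known shape".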
Abstract
In recent years, several advances have been observed in Deep Learning with surprising results. Models in this area have been increasingly used in numerous applications, including those sensitive to human life, which require clear explanations and justifications. Various explainability methods have been proposed, but not many metrics to evaluate these methods. The most commonly used metric is the Intersection over Union (IoU). However, due to the characteristics of the results of the explainability methods, called saliency maps, which do not have a known shape, we hypothesise that there must be a better metric that allows one to find an explainability method that produces results that best resemble the human perception. We propose using different metrics to assess the similarity between human perception and the explanation saliency maps to find a better metric. An investigation was…
