Evaluation of Explanation Methods of AI -- CNNs in Image Classification Tasks with Reference-based and No-reference Metrics
A. Zhukov, J. Benois-Pineau, R. Giot

TL;DR
This paper evaluates the quality of CNN explanation methods in image classification using both reference-based and no-reference metrics, demonstrating the effectiveness of stability metrics when ground truth explanations are unavailable.
Contribution
It generalizes evaluation methodologies for post-hoc CNN explainers, compares reference-based and no-reference metrics, and validates the use of a stability metric in the absence of ground truth.
Findings
The stability metric agrees with the reference-based metrics under input degradation.
The reference-based metrics are the Pearson correlation coefficient and Similarity, both computed between the explanation map and a ground-truth Gaze Fixation Density Map.
The proposed evaluation framework is applied to the explainers FEM, MLFEM, and Grad-CAM.
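As an illustration of the two reference-based metrics named above, the following is a minimal sketch rather than the authors' implementation. It assumes that both the explanation map and the Gaze Fixation Density Map are non-negative 2D arrays of the same shape, and that "Similarity" refers to the histogram-intersection measure commonly used in saliency benchmarking (the summary does not define it). The function names `pearson_cc` and `similarity` are hypothetical.

```python
import numpy as np


def pearson_cc(explanation, reference):
    """Pearson correlation coefficient between two maps, flattened to vectors."""
    e = np.asarray(explanation, dtype=float).ravel()
    r = np.asarray(reference, dtype=float).ravel()
    e = e - e.mean()
    r = r - r.mean()
    denom = np.sqrt((e ** 2).sum() * (r ** 2).sum())
    # A constant map has zero variance; return 0.0 instead of dividing by zero.
    return float((e * r).sum() / denom) if denom > 0 else 0.0


def similarity(explanation, reference):
    """Histogram intersection of two non-negative maps (assumed definition).

    Each map is normalized to sum to 1, then the element-wise minima are summed.
    The result lies in [0, 1]; 1 means identical distributions.
    """
    e = np.asarray(explanation, dtype=float).ravel()
    r = np.asarray(reference, dtype=float).ravel()
    if e.sum() <= 0 or r.sum() <= 0:
        return 0.0
    e = e / e.sum()
    r = r / r.sum()
    return float(np.minimum(e, r).sum())


if __name__ == "__main__":
    # Toy maps standing in for an explanation map and a gaze density map.
    explanation_map = np.array([[0.9, 0.1], [0.0, 0.0]])
    gaze_map = np.array([[0.8, 0.2], [0.0, 0.0]])
    print("PCC:", pearson_cc(explanation_map, gaze_map))
    print("SIM:", similarity(explanation_map, gaze_map))
```

Both scores increase as the explanation map gets closer to the ground-truth map, so higher values indicate better agreement with human gaze.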
Abstract
The most popular methods in the AI and machine learning paradigm are mostly black boxes, which makes the explanation of AI decisions an urgent need. Although dedicated explanation tools have been developed in large numbers, evaluating their quality remains an open research question. In this paper, we generalize the methodologies for evaluating post-hoc explainers of CNNs' decisions in visual classification tasks with reference-based and no-reference metrics. We apply them to our previously developed explainers (FEM, MLFEM) and to the popular Grad-CAM. The reference-based metrics are the Pearson correlation coefficient and Similarity, computed between the explanation map and its ground truth, represented by a Gaze Fixation Density Map obtained with a psycho-visual experiment. As a no-reference metric, we use the stability metric proposed by Alvarez-Melis and Jaakkola. We study its behaviour, consensus with…
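The stability metric of Alvarez-Melis and Jaakkola is usually formulated as a local Lipschitz estimate: the largest ratio between the change in the explanation and the change in the input over a small neighbourhood of the input. The sketch below approximates that maximum by random sampling. It is an assumption-laden illustration, not the procedure used in the paper, whose exact variant is not given in this excerpt. The names `local_lipschitz` and `explain`, the uniform perturbation scheme, and the parameters `eps` and `n_samples` are all hypothetical.

```python
import numpy as np


def local_lipschitz(explain, x, eps=0.1, n_samples=50, seed=0):
    """Sampled estimate of max ||explain(x) - explain(x')|| / ||x - x'||.

    `explain` is any callable mapping an input array to an explanation array.
    Perturbed inputs x' are drawn uniformly within +/- eps of x per element
    (an assumed neighbourhood; random search only lower-bounds the true max).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    base = np.asarray(explain(x), dtype=float)
    worst = 0.0
    for _ in range(n_samples):
        delta = rng.uniform(-eps, eps, size=x.shape)
        dist = np.linalg.norm(delta)
        if dist == 0.0:
            continue  # skip a degenerate zero perturbation
        perturbed = np.asarray(explain(x + delta), dtype=float)
        change = np.linalg.norm(perturbed - base)
        worst = max(worst, change / dist)
    return worst


if __name__ == "__main__":
    # Toy "explainer": absolute input values stand in for a saliency map.
    toy_explainer = np.abs
    image = np.linspace(-1.0, 1.0, 16).reshape(4, 4)
    print("Stability estimate:", local_lipschitz(toy_explainer, image))
```

A lower value means the explanation changes less under small input perturbations, that is, the explainer is more stable. No ground-truth map is required, which is why this kind of metric is usable as a no-reference measure.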
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Brain Tumor Detection and Classification
