Evaluating Explanation Without Ground Truth in Interpretable Machine Learning
Fan Yang, Mengnan Du, Xia Hu

TL;DR
This paper discusses the challenge of evaluating explanations in Interpretable Machine Learning without ground truth, proposing a formal framework and reviewing current methodologies for assessing explanation quality.
Contribution
It provides a formal definition of explanation evaluation, reviews existing methods across three aspects, and introduces a unified framework adaptable to various practical scenarios.
Findings
Summarizes three key aspects of explanation quality: generalizability, fidelity, and persuasibility.
Reviews state-of-the-art methodologies for each aspect in different tasks.
Identifies open problems and limitations in current evaluation techniques.
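Of the three aspects, fidelity (how faithfully an explanation reflects the model it explains) is the easiest to illustrate without ground truth. The sketch below uses a deletion-style check, a common proxy that is not necessarily the procedure the paper reviews: replace the features an explanation ranks highest with a baseline value and measure how much the model output changes. The function name `deletion_fidelity`, the toy linear model, and the zero baseline are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def deletion_fidelity(
    predict: Callable[[Sequence[float]], float],
    x: Sequence[float],
    attributions: Sequence[float],
    k: int,
    baseline: float = 0.0,
) -> float:
    """Absolute change in model output after replacing the k features with
    the largest |attribution| by a baseline value (hypothetical helper)."""
    if len(x) != len(attributions):
        raise ValueError("x and attributions must have the same length")
    if not 0 <= k <= len(x):
        raise ValueError("k must be between 0 and the number of features")

    # Rank feature indices by attribution magnitude, largest first.
    ranked = sorted(range(len(x)), key=lambda i: abs(attributions[i]), reverse=True)
    removed = set(ranked[:k])

    # Ablate the top-k features by substituting the baseline value.
    x_ablated: List[float] = [
        baseline if i in removed else v for i, v in enumerate(x)
    ]
    return abs(predict(x) - predict(x_ablated))


# Toy model (assumption): a fixed linear scorer, so the truly important
# features are known and the metric can be sanity-checked.
WEIGHTS = [3.0, 0.5, -2.0, 0.0]


def toy_model(x: Sequence[float]) -> float:
    return sum(w * v for w, v in zip(WEIGHTS, x))


x = [1.0, 1.0, 1.0, 1.0]

# A faithful explanation: weight * input contribution per feature.
faithful = [w * v for w, v in zip(WEIGHTS, x)]
# A misleading explanation: credits only the feature the model ignores.
misleading = [0.0, 0.0, 0.0, 1.0]

faithful_drop = deletion_fidelity(toy_model, x, faithful, k=1)
misleading_drop = deletion_fidelity(toy_model, x, misleading, k=1)
print(faithful_drop, misleading_drop)
```

A larger output change for the same `k` suggests the explanation points at features the model actually relies on. The choice of baseline and of `k` affects the score, so comparisons are only meaningful when both are held fixed across explanations.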
Abstract
Interpretable Machine Learning (IML) has become increasingly important in many real-world applications, such as autonomous cars and medical diagnosis, where explanations are significantly preferred to help people better understand how machine learning systems work and further enhance their trust towards systems. However, due to the diversified scenarios and subjective nature of explanations, we rarely have the ground truth for benchmark evaluation in IML on the quality of generated explanations. Having a sense of explanation quality not only matters for assessing system boundaries, but also helps to realize the true benefits to human users in practical settings. To benchmark the evaluation in IML, in this article, we rigorously define the problem of evaluating explanations, and systematically review the existing efforts from the state of the art. Specifically, we summarize three general…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
