Truthful Meta-Explanations for Local Interpretability of Machine Learning Models
Ioannis Mollas, Nick Bassiliades, Grigorios Tsoumakas

TL;DR
This paper introduces a local meta-explanation method for machine learning models that leverages a faithfulness-based truthfulness metric to improve interpretability in high-stakes applications.
Contribution
It presents a novel meta-explanation technique built on the truthfulness metric, enhancing the selection of interpretability methods for complex ML models.
Findings
The proposed technique effectively identifies truthful explanations.
The truthfulness metric correlates well with explanation quality.
Experimental results validate the approach's usefulness.
Abstract
The integration of automated machine learning (ML)-based systems into a wide range of tasks has expanded as a result of their performance and speed. Although employing ML-based systems has numerous advantages, systems that are not interpretable should not be used in critical, high-risk applications where human lives are at stake. To address this issue, researchers and businesses have focused on improving the interpretability of complex ML systems, and several such methods have been developed. Indeed, so many techniques now exist that practitioners find it difficult to choose the best one for their applications, even when using evaluation metrics. As a result, there is a clear demand for a selection tool: a meta-explanation technique based on a high-quality evaluation metric. In this paper, we present a local meta-explanation technique which builds on…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
