BEExAI: Benchmark to Evaluate Explainable AI
Samuel Sithakoul, Sara Meftah, Clément Feutry

TL;DR
BEExAI is a benchmark tool designed to systematically evaluate and compare the effectiveness of various post-hoc explainable AI methods using standardized metrics, addressing the lack of cohesive evaluation approaches.
Contribution
The paper introduces BEExAI, a comprehensive benchmark framework for large-scale comparison of post-hoc XAI methods with standardized evaluation metrics.
Findings
Provides a unified platform for evaluating XAI methods
Enables large-scale comparison of explanation quality
Facilitates development of more reliable explainability techniques
Abstract
Recent research in explainability has given rise to numerous post-hoc attribution methods aimed at enhancing our comprehension of the outputs of black-box machine learning models. However, the evaluation of explanation quality lacks a cohesive approach, and there is no consensus on how to derive quantitative metrics that gauge the efficacy of post-hoc attribution methods. Furthermore, with the development of increasingly complex deep learning models for diverse data applications, the need for a reliable way of measuring the quality and correctness of explanations is becoming critical. We address this by proposing BEExAI, a benchmark tool that allows large-scale comparison of different post-hoc XAI methods, employing a set of selected evaluation metrics.
Taxonomy
Topics: Explainable Artificial Intelligence (XAI)
Methods: Sparse Evolutionary Training
