Evaluation Cards for XAI Metrics
Rokas Gipiškis, Olga Kurasova

TL;DR
This paper introduces the XAI Evaluation Card, a standardized documentation template to improve transparency, consistency, and accountability in evaluating explainable AI methods.
Contribution
It proposes a new documentation template for XAI evaluation metrics to address current reporting inconsistencies and validation issues.
Findings
The Evaluation Card covers target properties, assumptions, validation, risks, and failure cases.
Adopting the template can reduce evaluation fragmentation and support meta-analysis.
It aims to improve accountability in XAI research.
Abstract
The evaluation of explainable AI (XAI) methods is affected by a lack of standardization. Metrics are inconsistently defined, incompletely reported, and rarely validated against common baselines. In this paper, we identify transparency of evaluation reporting as a central, under-addressed problem. We propose the XAI Evaluation Card, a documentation template analogous to model cards, designed to accompany any study that introduces an XAI evaluation metric. The card covers explicit declaration of target properties, grounding levels, metric assumptions, validation evidence, gaming risks, and known failure cases. We argue that adopting this template as a community norm would reduce evaluation fragmentation, support meta-analysis, and improve accountability in XAI research.
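To make the listed card sections concrete, here is a minimal sketch of how such a card could be represented as a structured record. It is an illustration only, not the authors' template: the class name `XAIEvaluationCard`, the field names, the `missing_sections` helper, and the example values are all assumptions. Only the six section topics (target properties, grounding levels, metric assumptions, validation evidence, gaming risks, known failure cases) come from the abstract. The abstract does not say what values a grounding level may take, so it is kept as free text.

```python
from dataclasses import dataclass, field, asdict
from typing import List
import json


@dataclass
class XAIEvaluationCard:
    """Hypothetical record mirroring the card sections named in the abstract."""

    metric_name: str                      # assumed field: which metric the card documents
    target_properties: List[str]          # properties the metric claims to measure
    grounding_level: str                  # free text; allowed values are not given in the abstract
    assumptions: List[str] = field(default_factory=list)
    validation_evidence: List[str] = field(default_factory=list)
    gaming_risks: List[str] = field(default_factory=list)
    known_failure_cases: List[str] = field(default_factory=list)

    def missing_sections(self) -> List[str]:
        """Return the names of sections that were left empty."""
        return [name for name, value in asdict(self).items() if not value]

    def to_json(self) -> str:
        """Serialize the card so it can be published alongside a paper."""
        return json.dumps(asdict(self), indent=2)


# Placeholder values for illustration only; they describe no real metric.
example_card = XAIEvaluationCard(
    metric_name="example-metric (placeholder)",
    target_properties=["faithfulness"],
    grounding_level="placeholder: to be declared by the metric's authors",
    assumptions=["placeholder assumption about the input baseline"],
)

if __name__ == "__main__":
    print(example_card.to_json())
    print("Sections still to fill in:", example_card.missing_sections())
```

A helper like `missing_sections` shows one way a venue or reviewer could check a card for completeness automatically. Whether the proposed template is meant to be machine-readable at all is not stated in the abstract.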
