Do Metrics for Counterfactual Explanations Align with User Perception?
Felix Liedeker, Basil Ell, Philipp Cimiano, Christoph Düsing

TL;DR
This study empirically evaluates whether common algorithmic metrics for counterfactual explanations align with human perceptions of explanation quality, revealing weak correlations and dataset dependency, and highlighting the need for human-centered evaluation methods.
Contribution
The paper provides an empirical comparison between algorithmic metrics and human judgments, demonstrating that current metrics capture user-perceived explanation quality only to a limited extent.
Findings
Weak correlation between metrics and human ratings
Dataset-dependent variability in metric effectiveness
Adding more metrics does not reliably improve prediction of human judgments
Abstract
Explainability is widely regarded as essential for trustworthy artificial intelligence systems. However, the metrics commonly used to evaluate counterfactual explanations are algorithmic evaluation metrics that are rarely validated against human judgments of explanation quality. This raises the question of whether such metrics meaningfully reflect user perceptions. We address this question through an empirical study that directly compares algorithmic evaluation metrics with human judgments across three datasets. Participants rated counterfactual explanations along multiple dimensions of perceived quality, which we relate to a comprehensive set of standard counterfactual metrics. We analyze both individual relationships and the extent to which combinations of metrics can predict human assessments. Our results show that correlations between algorithmic metrics and human ratings are…
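The individual metric-to-rating relationships described in the abstract could be examined with a rank correlation. The following is a minimal sketch, not the authors' analysis: the choice of Spearman correlation, the names `metric_score` and `human_rating`, and all numeric values are assumptions made up purely for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data (NOT from the paper): one algorithmic metric value and
# one human quality rating per counterfactual explanation.
metric_score = [0.8, 1.5, 0.3, 2.1, 1.1, 0.6]
human_rating = [4, 2, 5, 3, 3, 4]

rho = spearman(metric_score, human_rating)
print(f"Spearman rho = {rho:.2f}")
```

A coefficient near zero would indicate that the metric carries little information about the ratings. Repeating the computation per dataset is one way to probe the dataset dependency reported in the findings.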
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education · Ethics and Social Impacts of AI
