Evaluating counterfactual explanations using Pearl's counterfactual method
Bevan I. Smith

TL;DR
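This paper introduces a method for evaluating counterfactual explanations (CEs) using Judea Pearl's causal approach to computing counterfactuals. It finds that a sizable share of machine-learning-generated CEs conflict with the counterfactuals implied by the true causal structure, which underscores the need for causal understanding when generating explanations.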
Contribution
The paper presents itself as the first to apply Pearl's counterfactual method to evaluate machine-learning-generated explanations, and it shows how the underlying causal structure affects the validity of those explanations.
Findings
30% of CEs conflicted with Pearl's method
Causal structure significantly affects counterfactual explanations
Highlights need for causal awareness in explanation generation
Abstract
Counterfactual explanations (CEs) are methods for generating an alternative scenario that produces a different, desirable outcome. For example, if a student is predicted to fail a course, counterfactual explanations can show the student alternative actions that would lead to a predicted pass. The applications are many. However, CEs are currently generated from machine learning models that do not necessarily take into account the true causal structure in the data. As a result, bias can be introduced into the CE quantities. In this study I propose to test CEs using Judea Pearl's method of computing counterfactuals, which, surprisingly, has not yet appeared in the CE literature. I furthermore evaluate these CEs on three different causal structures to show how the true underlying causal structure affects the CEs that are generated. This…
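As a rough illustration of what "Pearl's method of computing counterfactuals" involves, the sketch below runs the standard three steps (abduction, action, prediction) on a toy linear structural causal model. Everything in it is an assumption made for illustration and is not taken from the paper: the three variables (study hours X, an intermediate assignment score M, a final grade Y), the coefficients a, b, c, the class name LinearSCM, and the "naive" comparison that changes X while holding M fixed. It is not one of the three causal structures the paper evaluates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearSCM:
    """Toy SCM (hypothetical): M = a*X + U_m,  Y = b*X + c*M + U_y."""
    a: float  # effect of X on M
    b: float  # direct effect of X on Y
    c: float  # effect of M on Y

    def counterfactual_y(self, x: float, m: float, y: float, x_new: float) -> float:
        # Step 1, abduction: recover this individual's noise terms from the evidence.
        u_m = m - self.a * x
        u_y = y - self.b * x - self.c * m
        # Step 2, action: replace the equation for X with the constant x_new.
        # Step 3, prediction: propagate x_new through the model with the same noise.
        m_cf = self.a * x_new + u_m
        return self.b * x_new + self.c * m_cf + u_y

    def naive_y(self, x: float, m: float, y: float, x_new: float) -> float:
        # Structure-blind change: move X but keep M at its observed value,
        # so the indirect path X -> M -> Y is ignored.
        return y + self.b * (x_new - x)


scm = LinearSCM(a=2.0, b=3.0, c=1.5)
x_obs, m_obs, y_obs = 4.0, 9.0, 40.0  # made-up observation for one student

pearl = scm.counterfactual_y(x_obs, m_obs, y_obs, x_new=6.0)
naive = scm.naive_y(x_obs, m_obs, y_obs, x_new=6.0)
print(f"Pearl counterfactual Y: {pearl}")  # 52.0
print(f"Structure-blind Y:      {naive}")  # 46.0
```

In this toy setting the two answers differ by c * a * (x_new - x_obs), which is exactly the indirect effect through M. That gap is one simple way a CE produced without the causal structure can disagree with the counterfactual computed by Pearl's method.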
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference
Methods: fail · Counterfactual Explanations · Test
