On Counterfactual Explanations under Predictive Multiplicity
Martin Pawelczyk, Klaus Broelemann, Gjergji Kasneci

TL;DR
This paper studies how predictive multiplicity affects counterfactual explanations. It derives an upper bound on explanation costs and empirically compares sparse and data support methods, revealing a trade-off between robustness and explanation cost.
Contribution
The paper derives a general upper bound on the cost of counterfactual explanations under predictive multiplicity and empirically compares sparse and data support methods on real-world data.
Findings
Data support methods are more robust to model multiplicity.
Sparse methods have lower explanation costs under a fixed model.
The sparsest counterfactual explanation is not always the best one.
Abstract
Counterfactual explanations are usually obtained by identifying the smallest change made to an input to change a prediction made by a fixed model (hereafter called sparse methods). Recent work, however, has revitalized an old insight: there often does not exist one superior solution to a prediction problem with respect to commonly used measures of interest (e.g. error rate). In fact, often multiple different classifiers give almost equal solutions. This phenomenon is known as predictive multiplicity (Breiman, 2001; Marx et al., 2019). In this work, we derive a general upper bound for the costs of counterfactual explanations under predictive multiplicity. Most notably, it depends on a discrepancy notion between two classifiers, which describes how differently they treat negatively predicted individuals. We then compare sparse and data support approaches empirically on real-world data.…
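The two ideas in the abstract, a smallest-change counterfactual under a fixed model and its fragility under predictive multiplicity, can be illustrated with a toy example. This is a minimal sketch and not the paper's method: it assumes two hypothetical linear classifiers (`w1, b1` and `w2, b2`) standing in for two near-equally good models, and it assumes Euclidean distance as the explanation cost. For a linear model, the closest counterfactual of a negatively predicted point is its projection onto the decision boundary.

```python
import numpy as np

def predict(w, b, x):
    """Return 1 (positive) if w.x + b >= 0, else 0 (negative)."""
    return int(np.dot(w, x) + b >= 0)

def closest_counterfactual(w, b, x, margin=1e-6):
    """Smallest L2 change that moves a negatively predicted x
    just across the boundary of the linear classifier (w, b)."""
    score = np.dot(w, x) + b
    # Project onto the hyperplane w.x + b = 0, then step slightly past it.
    step = (-score + margin) / np.dot(w, w)
    return x + step * w

# Two hypothetical classifiers (illustrative values, not from the paper).
w1, b1 = np.array([1.0, 1.0]), -2.0
w2, b2 = np.array([1.0, 0.8]), -2.0

x = np.array([0.5, 0.5])  # negatively predicted by both classifiers

# Counterfactual computed against the fixed model f1 only.
cf = closest_counterfactual(w1, b1, x)
cost = np.linalg.norm(cf - x)

valid_under_f1 = predict(w1, b1, cf) == 1
valid_under_f2 = predict(w2, b2, cf) == 1

print(f"cost={cost:.4f}, valid under f1: {valid_under_f1}, valid under f2: {valid_under_f2}")
```

In this toy setup the counterfactual sits right on the boundary of the first classifier, so a small shift in the boundary of the second classifier is enough to invalidate it. A counterfactual placed further inside the positive region would cost more but could remain valid under both models.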
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Bayesian Modeling and Causal Inference
