Measurable Counterfactual Local Explanations for Any Classifier
Adam White, Artur d'Avila Garcez

TL;DR
This paper introduces CLEAR, a method for generating local counterfactual explanations for any classifier. It measures fidelity by distance to the decision boundary and achieves higher fidelity than LIME across multiple case studies.
Contribution
The paper presents a novel approach, CLEAR, that creates counterfactual explanations with higher fidelity than existing methods like LIME, by integrating boundary-based fidelity measures.
Findings
CLEAR achieves over 45% higher fidelity than LIME in case studies.
Counterfactual explanations effectively identify minimal changes to flip predictions.
Fidelity measurement based on decision boundary distances improves explanation quality.
Abstract
We propose a novel method for explaining the predictions of any classifier. In our approach, local explanations are expected to explain both the outcome of a prediction and how that prediction would change if 'things had been different'. Furthermore, we argue that satisfactory explanations cannot be dissociated from a notion and measure of fidelity, as advocated in the early days of neural networks' knowledge extraction. We introduce a definition of fidelity to the underlying classifier for local explanation models which is based on distances to a target decision boundary. A system called CLEAR: Counterfactual Local Explanations via Regression, is introduced and evaluated. CLEAR generates w-counterfactual explanations that state minimum changes necessary to flip a prediction's classification. CLEAR then builds local regression models, using the w-counterfactuals to measure and improve…
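The idea of measuring fidelity by distance to a decision boundary can be illustrated with a small sketch. The code below is not CLEAR's implementation. It assumes a hypothetical two-feature classifier (`predict_proba`), finds a single-feature counterfactual by bisection, fits a one-dimensional least-squares line to the classifier's output around the instance, and compares where the line predicts the boundary with where the classifier actually crosses it. All function names, the sampling range, and the 0.5 threshold are illustrative assumptions.

```python
import math


def predict_proba(x):
    """Hypothetical black-box classifier: P(class 1) for a 2-feature input."""
    return 1.0 / (1.0 + math.exp(-(x[0] ** 2 + x[1] - 3.0)))


def find_counterfactual(x, feature, lo, hi, tol=1e-6):
    """Bisection for the value of one feature at which P(class 1) crosses 0.5.

    Assumes the probability crosses 0.5 exactly once in [lo, hi];
    returns None if there is no sign change over the interval.
    """
    def gap(v):
        z = list(x)
        z[feature] = v
        return predict_proba(z) - 0.5

    a, b = lo, hi
    if gap(a) * gap(b) > 0:
        return None
    while b - a > tol:
        m = (a + b) / 2.0
        if gap(a) * gap(m) <= 0:
            b = m
        else:
            a = m
    return (a + b) / 2.0


def fit_line(vs, ps):
    """Ordinary least squares for p ~ intercept + slope * v."""
    n = len(vs)
    mean_v = sum(vs) / n
    mean_p = sum(ps) / n
    cov = sum((v - mean_v) * (p - mean_p) for v, p in zip(vs, ps))
    var = sum((v - mean_v) ** 2 for v in vs)
    slope = cov / var
    return mean_p - slope * mean_v, slope


# Instance to explain: currently classified as class 0.
x = [1.0, 1.0]

# Actual counterfactual: smallest increase in feature 0 that flips the class.
actual = find_counterfactual(x, feature=0, lo=x[0], hi=3.0)

# Local surrogate: sample feature 0 around x and regress the probability on it.
vs = [x[0] + 0.05 * k for k in range(-10, 11)]
ps = [predict_proba([v, x[1]]) for v in vs]
intercept, slope = fit_line(vs, ps)

# Where the surrogate believes the boundary lies, and how far off it is.
estimated = (0.5 - intercept) / slope
fidelity_error = abs(estimated - actual)
print(f"actual={actual:.4f} estimated={estimated:.4f} error={fidelity_error:.4f}")
```

A smaller `fidelity_error` means the local regression places the decision boundary closer to where the classifier actually flips, which is the sense in which a boundary-based measure rewards faithful explanations in this toy setting.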
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
Methods: Local Interpretable Model-Agnostic Explanations
