Explaining Black-Box Algorithms Using Probabilistic Contrastive Counterfactuals
Sainyam Galhotra, Romila Pradhan, Babak Salimi

TL;DR
This paper introduces LEWIS, a causality-based framework that uses probabilistic contrastive counterfactuals to generate explanations and recourse for black-box AI systems. The authors report that it outperforms LIME and SHAP at producing human-understandable explanations.
Contribution
LEWIS is the first system to provide provably effective, scalable explanations and recourse at multiple levels without assuming internal model details, based solely on input-output data.
Findings
LEWIS outperforms LIME and SHAP in generating human-understandable explanations.
LEWIS provides effective recourse solutions that are scalable and correct.
Empirical results on real-world datasets validate LEWIS's effectiveness and scalability.
Abstract
There has been a recent resurgence of interest in explainable artificial intelligence (XAI) that aims to reduce the opaqueness of AI-based decision-making systems, allowing humans to scrutinize and trust them. Prior work in this context has focused on the attribution of responsibility for an algorithm's decisions to its inputs wherein responsibility is typically approached as a purely associational concept. In this paper, we propose a principled causality-based approach for explaining black-box decision-making systems that addresses limitations of existing methods in XAI. At the core of our framework lies probabilistic contrastive counterfactuals, a concept that can be traced back to philosophical, cognitive, and social foundations of theories on how humans generate and select explanations. We show how such counterfactuals can quantify the direct and indirect influences of a variable on…
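To make the idea of a probabilistic contrastive counterfactual concrete, the sketch below estimates three classical quantities (probability of necessity, of sufficiency, and of necessity and sufficiency) from nothing but a black box's inputs and outputs. It is not LEWIS's estimator; it uses Pearl's closed-form expressions, which hold only under two assumptions made here for illustration: the attribute is exogenous (unconfounded with the outcome), and the model's output is monotone when the attribute changes from `x_prime` to `x`. The function `contrastive_scores`, the toy model, and the toy rows are all hypothetical.

```python
from typing import Callable, Dict, Hashable, Mapping, Sequence

Row = Mapping[str, Hashable]


def contrastive_scores(
    model: Callable[[Row], int],
    rows: Sequence[Row],
    attr: str,
    x: Hashable,
    x_prime: Hashable,
    positive: int = 1,
) -> Dict[str, float]:
    """Estimate necessity/sufficiency-style scores for `attr` = x versus x_prime.

    ASSUMPTIONS (not taken from the paper): `attr` is exogenous and the
    model is monotone in the change x_prime -> x, so counterfactual
    probabilities reduce to conditional frequencies of the model's output.
    """

    def positive_rate(value: Hashable) -> float:
        # Fraction of rows with attr == value that the model labels positive.
        group = [r for r in rows if r[attr] == value]
        if not group:
            raise ValueError(f"no rows with {attr} == {value!r}")
        return sum(model(r) == positive for r in group) / len(group)

    p_x = positive_rate(x)         # P(positive | attr = x)
    p_xp = positive_rate(x_prime)  # P(positive | attr = x_prime)
    gap = max(p_x - p_xp, 0.0)     # clip at 0 if monotonicity fails in the sample

    return {
        # Among positives with attr = x: chance the outcome flips under x_prime.
        "necessity": gap / p_x if p_x > 0 else 0.0,
        # Among negatives with attr = x_prime: chance the outcome flips under x.
        "sufficiency": gap / (1.0 - p_xp) if p_xp < 1 else 0.0,
        # Chance the outcome responds to the attribute in both directions.
        "necessity_and_sufficiency": gap,
    }


def toy_model(row: Row) -> int:
    # Hypothetical black box: approve if income is high or credit is good.
    return int(row["income"] == "high" or row["credit"] == "good")


toy_rows = [
    {"income": "high", "credit": "good"},
    {"income": "high", "credit": "bad"},
    {"income": "low", "credit": "good"},
    {"income": "low", "credit": "bad"},
]

scores = contrastive_scores(toy_model, toy_rows, "income", "high", "low")
print(scores)
```

On this toy data, high income is fully sufficient for approval (score 1.0) but only partly necessary (score 0.5), because good credit offers an alternative route to approval. When the attribute is confounded or the model is not monotone, these frequency-based formulas no longer identify the counterfactual quantities and a causal model of the data is needed.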
Taxonomy
Methods: Counterfactual Explanations · Local Interpretable Model-Agnostic Explanations · Shapley Additive Explanations
