Gradient-based Counterfactual Explanations using Tractable Probabilistic Models
Xiaoting Shao, Kristian Kersting

TL;DR
This paper introduces a fast, gradient-based method for generating realistic counterfactual explanations in machine learning using tractable probabilistic models, avoiding complex optimization.
Contribution
It presents a novel approach that uses only two gradient computations to produce counterfactuals, improving speed and realism over existing methods.
Findings
The method is faster than traditional optimization-based approaches.
It produces more realistic counterfactual examples.
Empirical results show clear advantages over prior methods.
Abstract
Counterfactual examples are an appealing class of post-hoc explanations for machine learning models. Given input x of class y, its counterfactual is a contrastive example x' of another class y'. Current approaches primarily solve this task by a complex optimization: define an objective function based on the loss of the counterfactual outcome y' with hard or soft constraints, then optimize this function as a black-box. This "deep learning" approach, however, is rather slow, sometimes tricky, and may result in unrealistic counterfactual examples. In this work, we propose a novel approach to deal with these problems using only two gradient computations based on tractable probabilistic models. First, we compute an unconstrained counterfactual u of x to induce the counterfactual outcome y'. Then, we adapt u to higher density regions, resulting in x'.…
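The two steps described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: a toy logistic classifier stands in for the model being explained, a class-conditional isotropic Gaussian stands in for the tractable probabilistic model, and the names `two_step_counterfactual`, `step_outcome`, and `step_density` are hypothetical. The first gradient step moves the input toward the counterfactual class; the second moves the result toward a region of higher density.

```python
import math

# Hypothetical toy classifier: p(class 1 | x) = sigmoid(w . x + b).
W = [2.0, -1.0]
B = 0.0

# Hypothetical stand-in for a tractable density model:
# one isotropic Gaussian per class with unit variance.
MEANS = {0: [-1.0, 1.0], 1: [1.0, -1.0]}
VAR = 1.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def class_prob(x, target):
    """Probability the toy classifier assigns to class `target`."""
    p1 = sigmoid(sum(w * xi for w, xi in zip(W, x)) + B)
    return p1 if target == 1 else 1.0 - p1


def grad_log_class_prob(x, target):
    """Gradient of log p(target | x) with respect to x (closed form)."""
    p1 = class_prob(x, 1)
    coeff = (1.0 - p1) if target == 1 else -p1
    return [coeff * w for w in W]


def log_density(x, target):
    """Log-density of x under the Gaussian assumed for class `target`."""
    sq = sum((xi - mi) ** 2 for xi, mi in zip(x, MEANS[target]))
    return -0.5 * sq / VAR - 0.5 * len(x) * math.log(2 * math.pi * VAR)


def grad_log_density(x, target):
    """Gradient of the Gaussian log-density with respect to x."""
    return [-(xi - mi) / VAR for xi, mi in zip(x, MEANS[target])]


def two_step_counterfactual(x, target, step_outcome=2.0, step_density=0.5):
    # Step 1: one gradient step toward the counterfactual outcome,
    # giving an unconstrained counterfactual u.
    g1 = grad_log_class_prob(x, target)
    u = [xi + step_outcome * g for xi, g in zip(x, g1)]
    # Step 2: one gradient step toward higher density, giving x'.
    g2 = grad_log_density(u, target)
    x_cf = [ui + step_density * g for ui, g in zip(u, g2)]
    return u, x_cf


x = [-1.0, 1.0]  # an input the toy classifier assigns to class 0
u, x_cf = two_step_counterfactual(x, target=1)
print("u  =", u, " p(y'=1|u)  =", class_prob(u, 1))
print("x' =", x_cf, " p(y'=1|x') =", class_prob(x_cf, 1))
```

In this toy setting the step sizes are chosen by hand so that a single step of each kind suffices. How the step sizes are set, and how the density step is kept from undoing the class change, are details the abstract does not specify.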
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning in Materials Science · Machine Learning in Healthcare
