Very fast, approximate counterfactual explanations for decision forests
Miguel Á. Carreira-Perpiñán, Suryabhan Singh Hada

TL;DR
This paper introduces a fast, approximate method for generating counterfactual explanations for decision forests by restricting the search to data-populated regions, enabling quick, realistic solutions suitable for interactive use.
Contribution
The authors constrain the counterfactual search to the forest-defined regions of input space that contain actual data points. This reduces the optimization to a form of nearest-neighbor search, which scales to large forests and steers solutions toward realistic, high-density areas.
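One compact way to write this restriction, in notation assumed here rather than taken from the paper: let $F$ be the forest, $\bar{x}$ the given instance, $y^*$ the desired output, $d$ a distance, $x_1, \dots, x_N$ the data points, and $R(x_n)$ the forest-defined region that contains $x_n$. Since the forest output is constant on each such region, it is enough to check the output at the data point itself.

```latex
\begin{align*}
\text{exact problem:} \quad & \min_{x} \; d(x, \bar{x}) \quad \text{s.t.} \quad F(x) = y^* \\
\text{restricted problem:} \quad & \min_{n \,:\, F(x_n) = y^*} \; \min_{x \in R(x_n)} \; d(x, \bar{x})
\end{align*}
```

The outer minimization in the restricted problem runs over at most $N$ regions, which is what makes it resemble a nearest-neighbor search over the dataset.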
Findings
Method scales to large forests and high-dimensional data
Solutions are found very quickly, enabling interactive use
Generated counterfactuals are more realistic and data-driven
Abstract
We consider finding a counterfactual explanation for a classification or regression forest, such as a random forest. This requires solving an optimization problem to find the closest input instance to a given instance for which the forest outputs a desired value. Finding an exact solution has a cost that is exponential in the number of leaves in the forest. We propose a simple but very effective approach: we constrain the optimization to only those input space regions defined by the forest that are populated by actual data points. The problem reduces to a form of nearest-neighbor search using a certain distance on a certain dataset. This has two advantages: first, the solution can be found very quickly, scaling to large forests and high-dimensional data, and enabling interactive use. Second, the solution found is more likely to be realistic in that it is guided towards high-density…
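The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a scikit-learn `RandomForestClassifier` with axis-aligned splits, Euclidean distance, and a small margin `eps` to stay strictly inside each region. The helper names `region_box` and `approx_counterfactual` are hypothetical. For every training point that the forest already assigns to the target class, the sketch builds the box formed by intersecting the leaves that point reaches, projects the query onto that box, and keeps the closest projection.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def region_box(forest, x):
    """Return bounds (lo, hi] of the forest region containing x, i.e. the
    intersection of the leaf boxes that x reaches in every tree."""
    lo = np.full(x.shape[0], -np.inf)
    hi = np.full(x.shape[0], np.inf)
    for est in forest.estimators_:
        t = est.tree_
        node = 0
        while t.children_left[node] != -1:  # -1 marks a leaf node
            f, thr = t.feature[node], t.threshold[node]
            if x[f] <= thr:  # scikit-learn sends x[f] <= thr to the left child
                hi[f] = min(hi[f], thr)
                node = t.children_left[node]
            else:
                lo[f] = max(lo[f], thr)
                node = t.children_right[node]
    return lo, hi


def approx_counterfactual(forest, X_train, query, target, eps=1e-4):
    """Closest point to `query` among the data-populated regions whose
    forest prediction is `target` (hypothetical helper, Euclidean distance)."""
    preds = forest.predict(X_train)
    best, best_dist = None, np.inf
    for x in X_train[preds == target]:
        lo, hi = region_box(forest, x)
        # Shrink the box by eps, but never so far that it excludes x itself.
        low = np.minimum(lo + eps, x)
        high = np.maximum(hi - eps, x)
        cand = np.clip(query, low, high)  # projection of the query onto the box
        dist = np.linalg.norm(cand - query)
        if dist < best_dist:
            best, best_dist = cand, dist
    return best, best_dist  # best is None if no training point has the target class


# Synthetic data, cast to float32 to match scikit-learn's internal tree dtype.
X, y = make_classification(n_samples=300, n_features=5, random_state=0)
X = X.astype(np.float32)
forest = RandomForestClassifier(n_estimators=10, max_depth=4, random_state=0).fit(X, y)

query = X[0]
target = 1 - forest.predict(query.reshape(1, -1))[0]  # ask for the other class
cf, dist = approx_counterfactual(forest, X, query, target)
```

Because each candidate lies inside a region that contains a training point of the target class, the result is never farther from the query than the nearest such training point. Regions shared by several training points are recomputed here for simplicity; grouping points by the leaf indices returned by `forest.apply` would avoid the repeated work.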
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Anomaly Detection Techniques and Applications
