FOCUS: Flexible Optimizable Counterfactual Explanations for Tree Ensembles
Ana Lucic, Harrie Oosterhuis, Hinda Haned, Maarten de Rijke

TL;DR
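This paper introduces FOCUS, a method for generating counterfactual explanations for tree ensemble models. It replaces the non-differentiable ensemble with a probabilistic approximation so that gradient-based optimization can be applied, and the resulting counterfactuals lie closer to the original instances than those produced by other methods.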
Contribution
FOCUS extends counterfactual explanation techniques to non-differentiable models like tree ensembles through probabilistic approximations, enabling effective gradient-based optimization.
Findings
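FOCUS counterfactuals are significantly closer to the original instances than those produced by other methods.
Probabilistic approximations make gradient-based optimization applicable to non-differentiable models such as tree ensembles.
The explanations show how a tree ensemble's decision can be changed, not only why it was made.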
Abstract
Model interpretability has become an important problem in machine learning (ML) due to the increased effect that algorithmic decisions have on humans. Counterfactual explanations can help users understand not only why ML models make certain decisions, but also how these decisions can be changed. We frame the problem of finding counterfactual explanations as a gradient-based optimization task and extend previous work that could only be applied to differentiable models. In order to accommodate non-differentiable models such as tree ensembles, we use probabilistic model approximations in the optimization framework. We introduce an approximation technique that is effective for finding counterfactual explanations for predictions of the original model and show that our counterfactual examples are significantly closer to the original instances than those produced by other methods specifically…
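The idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that each hard split `x[f] > t` is relaxed into a sigmoid with a sharpness parameter `sigma`, that distance is measured as squared Euclidean distance weighted by `beta`, and that the ensemble is a hypothetical set of three decision stumps (`STUMPS`). All names and parameter values here are illustrative assumptions. The prediction term stays active only while the original, hard model still predicts the original class, so the search is always validated against the real model rather than its approximation.

```python
import numpy as np

# Hypothetical toy ensemble of decision stumps (an assumption for illustration):
# (feature index, threshold, value if x[f] <= threshold, value if x[f] > threshold).
# The ensemble predicts class 1 when the summed score is positive, else class 0.
STUMPS = [(0, 0.5, -1.0, 1.0), (1, 0.3, -0.5, 0.5), (0, 0.8, -0.5, 1.0)]


def hard_score(x):
    """Score of the original, non-differentiable ensemble."""
    return sum(vr if x[f] > t else vl for f, t, vl, vr in STUMPS)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def soft_score_and_grad(x, sigma):
    """Differentiable approximation: each split becomes a sigmoid 'probability'
    of taking the right branch. Larger sigma gives a sharper approximation."""
    score = 0.0
    grad = np.zeros_like(x)
    for f, t, vl, vr in STUMPS:
        p = sigmoid(sigma * (x[f] - t))
        score += (1.0 - p) * vl + p * vr
        grad[f] += (vr - vl) * sigma * p * (1.0 - p)
    return score, grad


def find_counterfactual(x, sigma=10.0, beta=0.5, lr=0.02, steps=500):
    """Gradient descent from a class-0 instance toward class 1.
    Returns (counterfactual, distance) for the closest valid point seen, or None."""
    x = np.asarray(x, dtype=float)
    x_cf = x.copy()
    best = None
    for _ in range(steps):
        _, g = soft_score_and_grad(x_cf, sigma)
        # Gradient of the distance term beta * ||x_cf - x||^2.
        grad = 2.0 * beta * (x_cf - x)
        # Prediction term: raise the soft score only while the hard model
        # still predicts the original class.
        if hard_score(x_cf) <= 0:
            grad -= g
        x_cf = x_cf - lr * grad
        # Validity is always checked against the original hard ensemble.
        if hard_score(x_cf) > 0:
            dist = np.linalg.norm(x_cf - x)
            if best is None or dist < best[1]:
                best = (x_cf.copy(), dist)
    return best
```

Calling `find_counterfactual(np.array([0.2, 0.1]))` starts from an instance that the toy ensemble assigns to class 0 and returns a nearby point that the hard ensemble assigns to class 1. The trade-off between `beta` and `sigma` matters in this sketch: a larger `beta` keeps the counterfactual closer but can stall the search before the prediction flips, while a small `sigma` smooths the splits so much that the approximation drifts away from the original model.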
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning
Methods: Interpretability
