A Framework and Benchmarking Study for Counterfactual Generating Methods on Tabular Data
Raphael Mazzine, David Martens

TL;DR
This paper presents a comprehensive benchmarking study and a new framework for evaluating counterfactual explanation algorithms on tabular data, providing insights into their performance across diverse datasets.
Contribution
It introduces a novel framework and a set of metrics for testing and comparing counterfactual generation methods, along with extensive benchmarking results.
Findings
No single algorithm is best; performance varies by dataset and context.
Certain algorithms perform better on specific dataset types.
Benchmarking results guide practitioners in selecting suitable methods.
Abstract
Counterfactual explanations are viewed as an effective way to explain machine learning predictions. This interest is reflected by a relatively young literature with already dozens of algorithms aiming to generate such explanations. These algorithms are focused on finding how features can be modified to change the output classification. However, this rather general objective can be achieved in different ways, which brings about the need for a methodology to test and benchmark these algorithms. The contributions of this work are manifold: First, a large benchmarking study of 10 algorithmic approaches on 22 tabular datasets is performed, using 9 relevant evaluation metrics. Second, the introduction of a novel, first of its kind, framework to test counterfactual generation algorithms. Third, a set of objective metrics to evaluate and compare counterfactual results. And finally, insight from…
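As an illustration of the objective described in the abstract (modifying features so that the output classification changes), here is a minimal sketch in Python. The toy classifier `predict`, the feature values, and the three measures shown (validity, sparsity, L2 distance) are assumptions chosen for illustration. They are not taken from the paper and are not claimed to match its nine metrics.

```python
import math

def predict(x):
    """Hypothetical toy classifier: class 1 when a weighted score reaches a threshold."""
    score = 0.5 * x[0] + 2.0 * x[1] - 1.0 * x[2]
    return 1 if score >= 10.0 else 0

def validity(factual, counterfactual):
    """True when the counterfactual is assigned a different class than the factual point."""
    return predict(factual) != predict(counterfactual)

def sparsity(factual, counterfactual):
    """Number of features whose value was changed."""
    return sum(1 for a, b in zip(factual, counterfactual) if a != b)

def l2_distance(factual, counterfactual):
    """Euclidean distance between the factual point and the counterfactual."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(factual, counterfactual)))

# Illustrative data only: three numeric features of one tabular row.
factual = [8.0, 2.0, 1.0]         # score = 4 + 4 - 1 = 7, so class 0
counterfactual = [8.0, 4.0, 1.0]  # score = 4 + 8 - 1 = 11, so class 1

print(validity(factual, counterfactual))     # did the class flip?
print(sparsity(factual, counterfactual))     # how many features changed?
print(l2_distance(factual, counterfactual))  # how far did the point move?
```

Two counterfactuals can both flip the class while differing in how many features they change and how far they move the point, which is why a comparison across algorithms needs several measures rather than one.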
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
