On the Definition and Detection of Cherry-Picking in Counterfactual Explanations

James Hinns, Sofie Goethals, Stephan Van der Veeken, Theodoros Evgeniou, and David Martens

arXiv:2601.04977 · cs.LG · January 9, 2026

Open Access

TL;DR

This paper investigates how hard it is to detect cherry-picking in counterfactual explanations. It finds that even with full access to the explanation procedure, manipulation is hard to identify because of the inherent variability and flexibility in explanation generation.

Contribution

The paper formally defines cherry-picking in counterfactual explanations, analyzes the limits of detection under three levels of auditor access, and empirically shows how difficult it is to distinguish manipulated explanations from genuine ones.

Findings

01. Detection of cherry-picking is extremely limited in practice.

02. Variability in explanations often exceeds the effects of cherry-picking.

03. Standard quality metrics cannot reliably identify manipulated explanations.

Abstract

Counterfactual explanations are widely used to communicate how inputs must change for a model to alter its prediction. For a single instance, many valid counterfactuals can exist, which allows an explanation provider to cherry-pick explanations that suit a narrative of their choice, highlighting favourable behaviour and withholding examples that reveal problematic behaviour. We formally define cherry-picking for counterfactual explanations in terms of an admissible explanation space, specified by the generation procedure, and a utility function. We then study to what extent an external auditor can detect such manipulation. Considering three levels of access to the explanation process (full procedural access, partial procedural access, and explanation-only access), we show that detection is extremely limited in practice. Even with full procedural access,…
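To make the two ingredients named in the abstract concrete, the following is a minimal Python sketch, not the paper's formal definition or experimental setup. Everything in it is a hypothetical stand-in: the toy linear `model`, the random-perturbation generator that plays the role of the admissible explanation space, and a provider `utility` that favours explanations which barely change feature 0. The honest baseline (closest counterfactual by L1 distance) is likewise an assumption chosen for illustration.

```python
import numpy as np

def model(x):
    # Hypothetical linear classifier: predicts 1 when x0 + x1 > 1.
    w = np.array([1.0, 1.0])
    return int(x @ w - 1.0 > 0)

def admissible_counterfactuals(x, n_candidates=500, scale=1.0, seed=0):
    # Assumed generation procedure: random Gaussian perturbations of x,
    # keeping only those that flip the model's prediction.
    rng = np.random.default_rng(seed)
    candidates = x + rng.normal(0.0, scale, size=(n_candidates, x.size))
    original = model(x)
    return [c for c in candidates if model(c) != original]

def honest_choice(x, cfs):
    # Assumed honest baseline: the valid counterfactual closest to x (L1).
    return min(cfs, key=lambda c: np.abs(c - x).sum())

def cherry_picked_choice(cfs, utility):
    # Cherry-picking in this sketch: pick the valid counterfactual
    # that maximises the provider's own utility.
    return max(cfs, key=utility)

x = np.array([0.0, 0.0])

def utility(c):
    # Hypothetical provider utility: prefer explanations that leave
    # feature 0 (e.g. a sensitive attribute) almost unchanged.
    return -abs(c[0] - x[0])

cfs = admissible_counterfactuals(x)
honest = honest_choice(x, cfs)
picked = cherry_picked_choice(cfs, utility)

print("valid counterfactuals:", len(cfs))
print("honest:", honest, "L1 distance:", np.abs(honest - x).sum())
print("cherry-picked:", picked, "L1 distance:", np.abs(picked - x).sum())
```

Both returned explanations are valid counterfactuals drawn from the same admissible set, which is why an auditor who sees only the chosen explanation has little to go on in this toy setting.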

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Scientific Computing and Data Management · Adversarial Robustness in Machine Learning