Text-to-Image Models for Counterfactual Explanations: a Black-Box Approach
Guillaume Jeanneret, Loïc Simon, Frédéric Jurie

TL;DR
This paper introduces TIME, a black-box method using text-to-image models to generate counterfactual explanations for classifiers, requiring only the image and prediction, without access to model internals.
Contribution
TIME is a novel black-box approach that leverages bias injection into Stable Diffusion to produce counterfactual explanations without needing classifier details.
Findings
TIME generates effective counterfactuals comparable to existing methods.
The approach requires only the image and its prediction, not the classifier's structure, parameters, or gradients.
Empirical results validate the method's robustness and effectiveness.
Abstract
This paper addresses the challenge of generating Counterfactual Explanations (CEs), involving the identification and modification of the fewest necessary features to alter a classifier's prediction for a given image. Our proposed method, Text-to-Image Models for Counterfactual Explanations (TIME), is a black-box counterfactual technique based on distillation. Unlike previous methods, this approach requires solely the image and its prediction, omitting the need for the classifier's structure, parameters, or gradients. Before generating the counterfactuals, TIME introduces two distinct biases into Stable Diffusion in the form of textual embeddings: the context bias, associated with the image's structure, and the class bias, linked to class-specific features learned by the target classifier. After learning these biases, we find the optimal latent code applying the classifier's predicted…
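The black-box constraint described in the abstract can be illustrated with a minimal sketch. It assumes, as an illustration only, that the data used to learn each class bias is collected by querying the classifier for its predicted label and grouping images accordingly. The names `group_by_prediction` and `predict` are hypothetical and not taken from the paper, and learning the textual embeddings in Stable Diffusion is not shown.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, TypeVar

Image = TypeVar("Image")


def group_by_prediction(
    images: Iterable[Image],
    predict: Callable[[Image], Hashable],
) -> Dict[Hashable, List[Image]]:
    """Partition images by the label a black-box classifier assigns to them.

    Only `predict(image)` is called: no weights, gradients, or logits
    are accessed. (Hypothetical helper, not the authors' code.)
    """
    groups: Dict[Hashable, List[Image]] = defaultdict(list)
    for image in images:
        groups[predict(image)].append(image)
    return dict(groups)


if __name__ == "__main__":
    # Toy stand-in: integers play the role of images, parity the role of
    # the classifier's decision.
    toy_images = [1, 2, 3, 4]
    by_label = group_by_prediction(toy_images, lambda x: x % 2)
    print(by_label)
```

Each group could then serve as the training set for one class-specific embedding, which is one plausible way to use predictions alone as supervision.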
Videos
Text-to-Image Models for Counterfactual Explanations: A Black-Box Approach · YouTube
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Machine Learning in Materials Science
Methods: Diffusion
