STEEX: Steering Counterfactual Explanations with Semantics
Paul Jacob, Éloi Zablocki, Hédi Ben-Younes, Mickaël Chen, Patrick Pérez, Matthieu Cord

TL;DR
This paper introduces STEEX, a novel framework for generating plausible, sparse, and region-guided counterfactual explanations for complex, high-quality images using semantic-to-image models, enhancing interpretability in safety-critical applications.
Contribution
It presents a new generative counterfactual explanation method that produces realistic modifications and allows user-guided, region-specific explanations for complex scenes.
Findings
Effective on high-quality portrait datasets
Produces plausible and sparse counterfactuals
Enables user-guided, region-specific explanations
Abstract
As deep learning models are increasingly used in safety-critical applications, explainability and trustworthiness become major concerns. For simple images, such as low-resolution face portraits, synthesizing visual counterfactual explanations has recently been proposed as a way to uncover the decision mechanisms of a trained classification model. In this work, we address the problem of producing counterfactual explanations for high-quality images and complex scenes. Leveraging recent semantic-to-image models, we propose a new generative counterfactual explanation framework that produces plausible and sparse modifications which preserve the overall scene structure. Furthermore, we introduce the concept of "region-targeted counterfactual explanations", and a corresponding framework, where users can guide the generation of counterfactuals by specifying a set of semantic regions of the…
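The region-targeted idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes one latent code per semantic region and replaces the semantic-to-image generator and the decision model with a single logistic classifier acting directly on those codes. The names `region_targeted_counterfactual`, `predict`, `editable`, and `lam` are hypothetical. Only the codes of user-selected regions are updated, and a squared-distance penalty keeps the change small.

```python
import numpy as np

def predict(z, w, b):
    """Toy decision model: logistic score on the per-region codes (assumed stand-in)."""
    return 1.0 / (1.0 + np.exp(-(np.sum(w * z) + b)))

def region_targeted_counterfactual(z, w, b, target, editable,
                                   lam=0.1, lr=0.5, steps=200):
    """Gradient descent on region codes, restricted to the editable regions.

    z        : (num_regions, dim) array, one code per semantic region (assumption)
    w, b     : weights and bias of the toy logistic decision model
    target   : desired decision, 0.0 or 1.0
    editable : boolean array of length num_regions chosen by the user
    lam      : weight of the penalty keeping the codes close to the originals
    """
    z0 = z.astype(float)
    z_cf = z0.copy()
    mask = editable[:, None].astype(float)  # zero out updates for frozen regions
    for _ in range(steps):
        p = predict(z_cf, w, b)
        # Gradient of cross-entropy toward `target`, plus the proximity penalty.
        grad = (p - target) * w + 2.0 * lam * (z_cf - z0)
        z_cf -= lr * mask * grad
    return z_cf

# Usage: three regions, only regions 0 and 2 may change.
z = np.zeros((3, 2))
w = np.ones((3, 2))
editable = np.array([True, False, True])
z_cf = region_targeted_counterfactual(z, w, -2.0, 1.0, editable)
```

In this sketch the frozen region keeps its original code exactly, which mirrors the stated goal of preserving the overall scene structure while changing only the regions the user selected.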
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning
Methods: Counterfactual Explanations
