Rethinking Visual Counterfactual Explanations Through Region Constraint
Bartlomiej Sobieski, Jakub Grzywaczewski, Bartlomiej Sadlej, Matthew Tivnan, Przemyslaw Biecek

TL;DR
This paper introduces region-constrained visual counterfactual explanations (RVCEs) to improve interpretability by limiting modifications to specific image regions, addressing issues of entanglement and confirmation bias in previous methods.
Contribution
The paper proposes a novel region-constrained approach for visual counterfactual explanations and introduces RCSB, a new method based on Schrödinger Bridges, achieving state-of-the-art results.
Findings
Improves on previous methods by a large margin
Enables exact counterfactual reasoning within predefined regions
Allows user interaction for region specification
Abstract
Visual counterfactual explanations (VCEs) have recently gained immense popularity as a tool for clarifying the decision-making process of image classifiers. This trend is largely motivated by what these explanations promise to deliver -- indicate semantically meaningful factors that change the classifier's decision. However, we argue that current state-of-the-art approaches lack a crucial component -- the region constraint -- whose absence prevents drawing explicit conclusions, and may even lead to faulty reasoning due to phenomena like confirmation bias. To address the issue of previous methods, which modify images in a very entangled and widely dispersed manner, we propose region-constrained VCEs (RVCEs), which assume that only a predefined image region can be modified to influence the model's prediction. To effectively sample from this subclass of VCEs, we propose…
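The region constraint described in the abstract can be illustrated with a simple masked blend. The sketch below is not the paper's RCSB sampler; it only shows, under assumptions, what "only a predefined image region can be modified" means at the pixel level. The function name `apply_region_constraint`, the random arrays standing in for images, and the rectangular mask are all hypothetical choices made for illustration.

```python
import numpy as np

def apply_region_constraint(original, candidate, region_mask):
    """Take candidate pixels inside the region and original pixels elsewhere.

    original, candidate: float arrays of shape (H, W, C)
    region_mask: boolean array of shape (H, W), True where edits are allowed
    (Hypothetical helper for illustration; not the paper's method.)
    """
    mask = region_mask.astype(original.dtype)
    if mask.ndim == original.ndim - 1:
        mask = mask[..., None]  # add a channel axis so the mask broadcasts
    return mask * candidate + (1.0 - mask) * original

# Random stand-ins for a real image and an unconstrained edited image.
rng = np.random.default_rng(0)
original = rng.random((8, 8, 3))
candidate = rng.random((8, 8, 3))

# Assumed region: a small rectangle where modification is permitted.
region_mask = np.zeros((8, 8), dtype=bool)
region_mask[2:5, 3:6] = True

rvce = apply_region_constraint(original, candidate, region_mask)
```

Because every pixel outside the mask is copied from the original, any change in the classifier's prediction between `original` and `rvce` can be attributed to the chosen region alone, which is the kind of explicit conclusion the abstract argues unconstrained VCEs cannot support.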
Peer Reviews
Decision: ICLR 2025 Poster
The paper is clearly written, with most details of the proposed method presented effectively. It includes numerous qualitative results that demonstrate the method’s effectiveness and clearly define the problem.
1. The paper demonstrates that the proposed method achieves better performance when a semantic region constraint is applied. However, it would be helpful to explore the effects when the region is not semantically relevant—such as when using a randomly selected area with similar shape and size—to understand how critical semantic relevance is to the method’s effectiveness. 2. In Table 1, three configurations of the proposed method are presented as settings A, B, and C. While it appears these co
- The paper is very well-written overall and generally a pleasure to read; the various technical aspects of the method are very well-motivated, rigorously and clearly presented, and demonstrated to be useful in ablation experiments. - Region-constrained counterfactual explanations are certainly very useful (although I would argue that this approach is not completely novel, cf. my comments below) and I believe the explicit problem formulation and framing here can help move this field forward. The
**The relationship to several prior works should be discussed more comprehensively**, and possibly some of these methods should be included as baselines. The explicitly novel contributions should be highlighted more clearly. - Both [Jeanneret et al. (2023)](https://openaccess.thecvf.com/content/CVPR2023/papers/Jeanneret_Adversarial_Counterfactual_Visual_Explanations_CVPR_2023_paper.pdf) and [Weng et al. (2024)](https://link.springer.com/chapter/10.1007/978-3-031-73016-0_20) also use automaticall
- Bringing the region constraint to counterfactual example generation makes sense, as it aligns better with human perception. - The authors provide a way to better inject guidance from the gradient of the classifier f into the generation process of the inpainting model. The visual examples indeed show much higher quality.
- As the regions are the core input of the proposed method, how to define/find the regions is crucial. The method uses either attribution map generation or a grounding method. Will these methods introduce additional bias in the understanding of the classifier? - From my perspective, the proposed method is more like just image inpainting with specific guidance, where the guidance comes from the gradients of the classification model. The quality of the explanation ability somehow entangle
Taxonomy
Topics: Data Visualization and Analytics
