Counterfactual Visual Explanation via Causally-Guided Adversarial Steering

Yiran Qiao; Disheng Liu; Yiren Lu; Yu Yin; Mengnan Du; Jing Ma

arXiv:2507.09881·cs.CV·September 30, 2025

Open Access · 3 Reviews

TL;DR

This paper introduces CECAS, a causally-guided adversarial framework for generating high-quality counterfactual visual explanations that respect causal relationships and reduce spurious correlations, improving interpretability.

Contribution

The paper proposes a novel causally-guided adversarial method for counterfactual explanations, addressing limitations of previous approaches by incorporating causal reasoning to enhance explanation quality.

Findings

01. Outperforms state-of-the-art methods on multiple benchmarks.

02. Achieves better validity, sparsity, proximity, and realism in explanations.

03. Effectively reduces spurious correlations in counterfactual images.

Abstract

Recent work on counterfactual visual explanations has contributed to making artificial intelligence models more explainable by providing visual perturbation to flip the prediction. However, these approaches neglect the causal relationships and the spurious correlations behind the image generation process, which often leads to unintended alterations in the counterfactual images and renders the explanations with limited quality. To address this challenge, we introduce a novel framework CECAS, which first leverages a causally-guided adversarial method to generate counterfactual explanations. It innovatively integrates a causal perspective to avoid unwanted perturbations on spurious factors in the counterfactuals. Extensive experiments demonstrate that our method outperforms existing state-of-the-art approaches across multiple benchmark datasets and ultimately achieves a balanced trade-off…

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 4 · Confidence 3

Strengths

- The paper tackles an important problem in counterfactual visual explanation by aiming to make generated examples causally faithful and semantically meaningful.
- The experimental evaluation is extensive and covers multiple datasets, metrics, and strong baselines, which supports the empirical effectiveness of the results.
- The paper is clearly written and easy to follow, with well-structured sections and illustrative figures that make the proposed method understandable.

Weaknesses

My main concern with this paper is that the proposed method is overly complex and lacks conceptual elegance. The pipeline combines several loosely connected components: a PGD-based adversarial attack, a VAE for causal–spurious disentanglement, a mask extraction step, and diffusion-based inpainting. Each of them introduces its own set of hyperparameters and loss terms. While these parts collectively improve visual quality, the overall design feels more like a layered engineering solution than a c

Reviewer 02 · Rating 2 · Confidence 4

Strengths

S1. The proposed approach is evaluated on a representative set of datasets with metrics covering the entire evaluation spectrum present in the current literature on CEs.

Weaknesses

W1. The proposed methodology is flawed at its core. The authors propose generating CEs with an additional constraint on the spurious correlations learned by the explained model that limits their appearance in the explanations. However, the primary role of CEs is to actually reveal these correlations, since they highlight the model's unreasonable failure cases, which is desirable.

W2. The paper claims to be outperforming *recent state-of-the-art* methods for CE generation, while ignoring the act

Reviewer 03 · Rating 4 · Confidence 3

Strengths

The paper addresses an interesting problem of causality in images, and appears to avoid some unnecessary changes in the images based on the figures. The experimental evaluation is heavily in their method's favor, suggesting it is a worthwhile contribution. An extra qualitative evaluation adds weight to the actual image quality.

Weaknesses

In general, my main critique of the paper is that it appears to be quite incremental in its contributions. There are quite a few papers addressing the issue of counterfactual image generation, and the contribution of this one is not overly clear to me. For example, in the paper's claimed contributions they state "We provide a new perspective on the problem of counterfactual visual explanation by highlighting the critical role of causality". But please see e.g. [1], this just isn't true at all,

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Digital Media Forensic Detection · Cell Image Analysis Techniques