ATEX-CF: Attack-Informed Counterfactual Explanations for Graph Neural Networks
Yu Zhang, Sean Bin Yang, Arijit Khan, Cuneyt Gurcan Akcora

TL;DR
ATEX-CF introduces a unified framework combining adversarial attack techniques with counterfactual explanations to generate faithful, sparse, and plausible explanations for graph neural networks by jointly optimizing multiple criteria.
Contribution
This work presents the first method to integrate adversarial attack insights into counterfactual explanation generation for GNNs, allowing both edge additions and deletions.
Findings
ATEX-CF produces more faithful explanations than traditional methods.
The method achieves high plausibility and sparsity in explanations.
Experimental results on benchmarks validate its effectiveness.
Abstract
Counterfactual explanations offer an intuitive way to interpret graph neural networks (GNNs) by identifying minimal changes that alter a model's prediction, thereby answering "what must differ for a different outcome?". In this work, we propose a novel framework, ATEX-CF, that unifies adversarial attack techniques with counterfactual explanation generation, a connection made feasible by their shared goal of flipping a node's prediction, yet differing in perturbation strategy: adversarial attacks often rely on edge additions, while counterfactual methods typically use deletions. Unlike traditional approaches that treat explanation and attack separately, our method efficiently integrates both edge additions and deletions, grounded in theory, leveraging adversarial insights to explore impactful counterfactuals. In addition, by jointly optimizing fidelity, sparsity, and plausibility under a…
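To make the addition-versus-deletion point concrete, here is a minimal sketch of a brute-force counterfactual search on a toy graph. It is not the ATEX-CF algorithm: the `predict` function is a hypothetical stand-in for a trained GNN (the sign of the mean feature over a node and its neighbors), and `minimal_counterfactual`, the feature values, and the edge list are all assumptions chosen for illustration.

```python
from itertools import combinations


def predict(features, edges, node):
    """Toy stand-in for a GNN (assumption): class 1 if the mean feature
    over the node and its neighbors is positive, else class 0."""
    neigh = {node}
    for u, v in edges:
        if u == node:
            neigh.add(v)
        elif v == node:
            neigh.add(u)
    mean = sum(features[n] for n in neigh) / len(neigh)
    return 1 if mean > 0 else 0


def minimal_counterfactual(features, edges, node, allow_additions, max_edits=2):
    """Return the smallest edit set that flips the prediction for `node`,
    or None if no set of at most `max_edits` edits does."""
    edges = {tuple(sorted(e)) for e in edges}
    original = predict(features, edges, node)
    n = len(features)

    # Candidate edits: every existing edge may be deleted ...
    candidates = [("del", e) for e in sorted(edges)]
    if allow_additions:
        # ... and, optionally, every missing edge may be added.
        all_pairs = {(u, v) for u in range(n) for v in range(u + 1, n)}
        candidates += [("add", e) for e in sorted(all_pairs - edges)]

    # Try edit sets in order of increasing size, so the first hit is minimal.
    for k in range(1, max_edits + 1):
        for edit_set in combinations(candidates, k):
            perturbed = set(edges)
            for op, e in edit_set:
                if op == "del":
                    perturbed.discard(e)
                else:
                    perturbed.add(e)
            if predict(features, perturbed, node) != original:
                return list(edit_set)
    return None


# Hypothetical toy data: node 0 is predicted as class 1.
features = [1.0, 1.0, -3.0, -3.0]
edges = [(0, 1), (2, 3)]

deletions_only = minimal_counterfactual(features, edges, 0, allow_additions=False)
with_additions = minimal_counterfactual(features, edges, 0, allow_additions=True)
print(deletions_only)   # None
print(with_additions)   # [('add', (0, 2))]
```

In this toy setting no deletion can flip node 0, while a single added edge does. Exhaustive search is only workable on tiny graphs, which is one reason a guided source of candidate additions is attractive.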
Peer Reviews
Decision·ICLR 2026 Poster
1. Strong theoretical and empirical evidence justifies including edge additions for counterfactual explainability.
2. The attack-guided candidate generation is innovative; the signed mask, STE, and pruning pipeline coherently addresses both search efficiency and plausibility.
3. ATEX-CF achieves higher flip rates using fewer, more plausible edits across various datasets and GNN models.
4. The work is framed as the first to bridge GNN adversarial attacks with counterfactual explanations, distingu
1. The pipeline is complex and densely explained.
2. Additions and deletions are assumed to be equally feasible; asymmetric costs are not explored.
3. The evaluation misses some recent counterfactual baselines that also permit additions, such as InduCE (Inductive Counterfactual Explanations for Graph Neural Networks), slightly weakening the claim of empirical dominance.
4. Released GitHub code is difficult for international researchers to reproduce, as some comments are i
1. The idea is interesting. While some recent work also considers edge additions, using GNN attacks as a source of edges to add is still novel.
2. The paper is easy to follow, as the authors put considerable effort into motivating their research and explaining their methodology. However, the space left for the experiments looks too short (this can be improved).
3. The related work section (though in the appendix) is very detailed.
The experiments are not convincing for two reasons:
1. They consider very limited baselines. In particular, among CF-explanation methods, they consider only one (CF-GNNExplainer) as a baseline. However, there are many other CF-explanation methods (which they also review in related work), including RCExplainer, GNN-MOExp, CF$^2$, NSEG, Banzhaf, CF-GFExplainer, INDUCE, C2Explainer, CLEAR, GCFExplainer, etc. (Action: add more baselines in this category.)
2. The datasets cons
- Originality: The paper connects adversarial edge additions to counterfactual generation, which usually uses edge deletions instead.
- Solid optimization mechanics: The paper uses a signed mask with ternary forward discretization, a top-κ budget, and minimality-aware pruning that reduces edits/runtime while preserving flips/plausibility.
- Reproducibility and robustness checks: Code and configs released with means/SDs across seeds; sensitivity and pruning analyses quantify stability and efficie
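The optimization mechanics named in this review (signed mask, ternary discretization, top-κ budget, minimality-aware pruning) can be illustrated with a rough forward-pass sketch. The paper's actual procedure is not reproduced here: the function names, the scores, and the `still_flips` oracle are all hypothetical, and the straight-through gradient step is omitted because this sketch uses no autograd library.

```python
def ternarize_top_k(scores, k, threshold=0.0):
    """Map real-valued signed scores to {-1, 0, +1}.

    Assumed convention: a positive score proposes adding the edge, a
    negative score proposes deleting it. At most `k` entries (the largest
    by magnitude, above `threshold`) stay nonzero.
    """
    ranked = sorted(scores, key=lambda e: abs(scores[e]), reverse=True)
    kept = {e for e in ranked if abs(scores[e]) > threshold}
    kept = set([e for e in ranked if e in kept][:k])
    return {e: ((1 if scores[e] > 0 else -1) if e in kept else 0) for e in scores}


def prune(edits, still_flips):
    """Greedy minimality pruning: drop any edit whose removal still leaves
    the prediction flipped, as judged by the `still_flips` callable."""
    edits = dict(edits)
    for e in list(edits):
        trial = {key: sign for key, sign in edits.items() if key != e}
        if trial and still_flips(trial):
            edits = trial
    return edits


# Hypothetical learned scores for four candidate edges.
scores = {(0, 2): 0.9, (0, 1): -0.7, (1, 3): 0.2, (2, 3): -0.05}

mask = ternarize_top_k(scores, k=2)
edits = {e: sign for e, sign in mask.items() if sign != 0}

# Hypothetical oracle: the prediction flips whenever edge (0, 2) is added.
pruned = prune(edits, still_flips=lambda ed: ed.get((0, 2)) == 1)

print(mask)    # {(0, 2): 1, (0, 1): -1, (1, 3): 0, (2, 3): 0}
print(pruned)  # {(0, 2): 1}
```

In a trained setting the oracle would re-run the model on the perturbed graph. The fixed lambda is used here only to keep the example self-contained.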
- Theory: “Hypothesis 1” is asserted as “proved” in the appendix but functions more as an assumption tied to empirical overlap; this weakens guarantees.
- Plausibility metric is coarse: degree/clustering penalties may not capture domain constraints like temporal or type compatibility; evaluation reuses the same surrogate.
- Dependence on attack generator: Candidate additions come from GOttack; robustness to this choice and ablations vs. simpler heuristics are unclear.
- Scope: Only node classifi
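For readers unfamiliar with structural plausibility surrogates, the following sketch shows one way a degree/clustering penalty of the kind the review mentions could be computed. The exact metric used in the paper is not given in this excerpt, so the formula, the weights `w_deg` and `w_clu`, and all function names are assumptions.

```python
def build_adj(n, edges):
    """Undirected adjacency sets for nodes 0..n-1."""
    adj = {i: set() for i in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def local_clustering(adj, v):
    """Fraction of neighbor pairs of v that are themselves connected."""
    neigh = adj[v]
    k = len(neigh)
    if k < 2:
        return 0.0
    links = sum(1 for a in neigh for b in neigh if a < b and b in adj[a])
    return 2.0 * links / (k * (k - 1))


def plausibility_penalty(n, edges_before, edges_after, w_deg=1.0, w_clu=1.0):
    """Assumed surrogate: mean absolute change in degree plus mean absolute
    change in local clustering coefficient, weighted by w_deg and w_clu."""
    before = build_adj(n, edges_before)
    after = build_adj(n, edges_after)
    deg = sum(abs(len(after[v]) - len(before[v])) for v in range(n)) / n
    clu = sum(
        abs(local_clustering(after, v) - local_clustering(before, v))
        for v in range(n)
    ) / n
    return w_deg * deg + w_clu * clu


# Hypothetical edit: add edge (0, 2) to a four-node graph.
before = [(0, 1), (2, 3)]
after = [(0, 1), (2, 3), (0, 2)]
penalty = plausibility_penalty(4, before, after)
print(penalty)  # 0.5
```

A purely structural score like this is blind to node types or timestamps, which is the limitation the reviewer raises.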
Taxonomy
Topics: Advanced Graph Neural Networks · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning
