Counterfactual Explanations on Robust Perceptual Geodesics
Eslam Zaher, Maciej Trzaskowski, Quan Nguyen, Fred Roosta

TL;DR
This paper introduces Perceptual Counterfactual Geodesics (PCG), a method that generates semantically meaningful counterfactual explanations by tracing geodesics in a perceptually aligned Riemannian space, improving on existing approaches.
Contribution
The paper proposes a novel geodesic-based approach using a perceptual Riemannian metric for counterfactual explanations, addressing limitations of previous flat or misaligned geometry methods.
Findings
PCG outperforms baseline methods on three vision datasets.
It produces smooth, on-manifold, semantically valid transitions.
It reveals failure modes hidden under standard metrics.
Abstract
Latent-space optimization methods for counterfactual explanations - framed as minimal semantic perturbations that change model predictions - inherit the ambiguity of Wachter et al.'s objective: the choice of distance metric dictates whether perturbations are meaningful or adversarial. Existing approaches adopt flat or misaligned geometries, leading to off-manifold artifacts, semantic drift, or adversarial collapse. We introduce Perceptual Counterfactual Geodesics (PCG), a method that constructs counterfactuals by tracing geodesics under a perceptually Riemannian metric induced from robust vision features. This geometry aligns with human perception and penalizes brittle directions, enabling smooth, on-manifold, semantically valid transitions. Experiments on three vision datasets show that PCG outperforms baselines and reveals failure modes hidden under standard metrics.
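To make "tracing geodesics" under a feature-induced metric more concrete, here is a minimal sketch of one common way to approximate such a path: discretize the latent curve, measure its energy in feature space, and descend on the interior points while the endpoints stay fixed. Everything in it is an illustrative assumption rather than the authors' implementation. `feature_map` is a hypothetical toy surface standing in for the composition of generator and robust vision features, and `path_energy`, `straight_path`, and `refine_path` are made-up helper names. The paper's counterfactual-specific step, in which the endpoint itself is also optimized, is not modeled here.

```python
import numpy as np

def feature_map(z):
    """Hypothetical stand-in for (robust features o generator); a toy curved surface."""
    z = np.asarray(z, dtype=float)
    x, y = z[..., 0], z[..., 1]
    return np.stack([x, y, np.sin(x) * np.cos(y)], axis=-1)

def path_energy(path):
    """Discrete curve energy: sum of squared feature-space steps along the path."""
    feats = feature_map(path)
    diffs = feats[1:] - feats[:-1]
    return float(np.sum(diffs ** 2))

def straight_path(z_start, z_end, n_points):
    """Linear interpolation in latent space (the flat-geometry baseline)."""
    ts = np.linspace(0.0, 1.0, n_points)[:, None]
    return (1.0 - ts) * np.asarray(z_start, dtype=float) + ts * np.asarray(z_end, dtype=float)

def numerical_grad(path, eps=1e-5):
    """Central-difference gradient of the energy w.r.t. interior points only."""
    grad = np.zeros_like(path)
    for i in range(1, len(path) - 1):      # endpoints are held fixed
        for j in range(path.shape[1]):
            up, down = path.copy(), path.copy()
            up[i, j] += eps
            down[i, j] -= eps
            grad[i, j] = (path_energy(up) - path_energy(down)) / (2.0 * eps)
    return grad

def refine_path(z_start, z_end, n_points=12, steps=300, lr=0.02):
    """Plain gradient descent on the discrete energy, starting from the straight line."""
    path = straight_path(z_start, z_end, n_points)
    for _ in range(steps):
        path = path - lr * numerical_grad(path)
    return path

if __name__ == "__main__":
    z0, z1 = np.array([0.0, 0.3]), np.array([3.0, 0.6])
    print("straight-line energy:", path_energy(straight_path(z0, z1, 12)))
    print("refined-path energy: ", path_energy(refine_path(z0, z1)))
```

In this toy setting the refined path has lower feature-space energy than the straight latent interpolation, which is the sense in which a flat latent geometry and a feature-induced geometry can disagree. A realistic version would replace the finite-difference gradient with automatic differentiation through the actual networks.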
Peer Reviews
Decision·ICLR 2026 Poster
- Strong motivation: The paper clearly articulates three specific failure modes of prior latent-space methods: (1) off-manifold traversal leading to artifacts, (2) local gradient optimization that ignores global structure, and (3) generator exploitation of non-robust metrics.
- Perceptual Counterfactual Geodesics (PCG) is logical and well-structured. It employs a two-phase optimization process: first finding an energy-minimizing geodesic, and then jointly refining the path and its endpoint w
- The paper's own conclusion frames its contribution as operationalizing established ideas from pullback geometry and robust perception. The technical novelty is incremental rather than a fundamentally new theory.
- The paper's main quantitative results are incomplete, with runtime comparisons relegated to the appendix. There is limited discussion of scalability to longer paths or higher-resolution images.
- The justification for why robust features necessarily yield semantically meaningful geodesic
S1. The considered problem is often overlooked in the CE context, especially in the computer vision domain, where human perception may be easily fooled. It is also of extreme importance to the explainability community: since CEs stand at the top of Pearl's causality ladder, it is crucial to generate explanations that are truly valid. The authors provide an elegant solution to this problem, with clearly highlighted motivation and proper mathematical formalism.
S2. The proposed PCG algorith
W1. Robust models are never "infinitely" robust (lines 225-227), meaning that their robustness is preserved only up to some neighborhood of each point. What are the ways of measuring this robustness? How does the level of this robustness influence the resulting induced geometry? What limitations arise from that and how can they be overcome?
W2. Is there some theoretical justification for the definition of the robust perceptual metric $G_R$ (its equation is not numbered, but can be seen at line 234)?
- The proposed method is well-motivated and extensively described in the paper
- The contribution of the paper is situated within the existing literature
- The empirical results in the paper are encouraging, both for the geodesics and the generated counterfactual explanations
- The paper is clearly written
**Limited empirical results:** While the motivation and description of the proposed method are extensive, the breadth of the empirical results is surprisingly limited. By this I mean that:
- The qualitative examples in the paper are almost all either for humans or for cats and dogs. So there is a limited breadth in terms of the classes for which the counterfactuals are explored.
- The classifiers for which the counterfactuals are generated are VGG-19 backbones trained on a binary classification
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning
