Learning Counterfactually Invariant Predictors
Francesco Quinzan, Cecilia Casolo, Krikamol Muandet, Yucen Luo, Niki Kilbertus

TL;DR
This paper introduces a graphical criterion for identifying counterfactually invariant predictors and proposes a kernel-based framework, CIP, to learn such predictors, enhancing fairness and robustness in real-world applications.
Contribution
The paper provides a graphical criterion for counterfactual invariance and develops CIP, a model-agnostic method using HSCIC to enforce this invariance in predictors.
Findings
CIP effectively enforces counterfactual invariance in diverse datasets.
Graphical criteria offer a sufficient condition for counterfactual invariance.
Experimental results show improved fairness and robustness.
Abstract
Notions of counterfactual invariance (CI) have proven essential for predictors that are fair, robust, and generalizable in the real world. We propose graphical criteria that yield a sufficient condition for a predictor to be counterfactually invariant in terms of a conditional independence in the observational distribution. In order to learn such predictors, we propose a model-agnostic framework, called Counterfactually Invariant Prediction (CIP), building on the Hilbert-Schmidt Conditional Independence Criterion (HSCIC), a kernel-based conditional dependence measure. Our experimental results demonstrate the effectiveness of CIP in enforcing counterfactual invariance across various simulated and real-world datasets including scalar and multi-variate settings.
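To make the role of the kernel-based regularizer more concrete, the following is a minimal sketch of a kernel ridge style estimate of a squared HSCIC between predictions and a conditioning set. It is not the authors' implementation: the function names (`rbf_kernel`, `hscic_squared`), the Gaussian kernels, the shared bandwidth `sigma`, the ridge parameter `lam`, and the averaging over sample points are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(x, sigma=1.0):
    """Gaussian (RBF) Gram matrix; rows of x are samples."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(len(x), -1)
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def hscic_squared(y_hat, a, z, lam=1e-2, sigma=1.0):
    """Hypothetical helper: plug-in estimate of squared HSCIC(y_hat, a | z),
    averaged over the observed values of z.

    Assumes Gaussian kernels with one bandwidth `sigma` and a ridge
    parameter `lam`; both are illustrative choices, not values from the paper.
    """
    n = len(z)
    K_y = rbf_kernel(y_hat, sigma)
    K_a = rbf_kernel(a, sigma)
    K_z = rbf_kernel(z, sigma)

    # Column i of W holds the ridge weights w(z_i) = (K_z + n*lam*I)^{-1} k_z(z_i).
    W = np.linalg.solve(K_z + n * lam * np.eye(n), K_z)

    KyW = K_y @ W
    KaW = K_a @ W
    # For each z_i: w^T (K_y * K_a) w, the squared norm of the joint embedding.
    joint = np.sum(W * ((K_y * K_a) @ W), axis=0)
    # For each z_i: w^T ((K_y w) * (K_a w)), the cross term.
    cross = np.sum(W * KyW * KaW, axis=0)
    # For each z_i: (w^T K_y w)(w^T K_a w), the product of marginal embeddings.
    marginals = np.sum(W * KyW, axis=0) * np.sum(W * KaW, axis=0)

    values = joint - 2.0 * cross + marginals
    # Clip tiny negative values caused by floating-point error.
    return float(np.mean(np.maximum(values, 0.0)))
```

In a training loop, a quantity of this kind would be added to the prediction loss with a trade-off weight, so that predictions carrying conditional dependence on the protected or spurious variables are penalized. The exact penalty, kernels, and conditioning set used in CIP should be taken from the paper itself.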
Peer Reviews
Decision · Submitted to ICLR 2024
1. The theoretical analysis is sufficient, and the studied problem is interesting.
1. My main concern is the use of the term counterfactual invariance. Counterfactuals refer to individual-level potential outcomes, rather than conditional or sub-population levels. Hence, pursuing counterfactual outcomes relies on a prior SCM or very sharp bounds, and estimating counterfactual outcomes from observational data is nearly impossible even with the aid of A/B tests. Hence, as your independence regularization only enforces population-level independence, how can your CIP achieve targets in t
Their experimental results demonstrate the effectiveness of their method in enforcing counterfactual invariance across various simulated and real-world datasets including scalar and multi-variate settings.
Please refer to Questions.
- Learning counterfactually invariant predictors solely from the observational distribution is a relevant and important research problem in the field of causal inference.
- Extensive experiments on synthetic and real data demonstrate how the proposed method works.
- A strong limitation of Theorem 3.2, which corresponds to the issue in the experiments mentioned next. The assumption made in Theorem 3.2, which presumes that $X = g(X, ...)$, seems to be a strong assumption. The authors clarify that this assumption implies $pa(V) \in \mathbf{X} \cup \mathbf{A}$, which should imply $\mathbf{A} \cup \mathbf{X}$ should be a maximal connected graph (and so $\mathbf{A} \cup \mathbf{W}$ ) and $Y$ is a root node, meaning $Y$ cannot be any parent of $\mathbf{X} \cup \
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference · Adversarial Robustness in Machine Learning
