TL;DR
This paper systematically analyzes the R-Learner for network causal inference, revealing a critical representation bottleneck and proposing an end-to-end Graph R-Learner that outperforms a non-DML GNN T-Learner.
Contribution
It provides the first rigorous evidence of a representation bottleneck in R-Learners on graphs and introduces a successful end-to-end Graph R-Learner approach.
Findings
R-Learners with a graph-blind final stage fail completely on graph data (MSE > 4.0, p < 0.001).
The end-to-end Graph R-Learner significantly outperforms a non-DML GNN T-Learner.
A topology-dependent nuisance bottleneck emerges and is linked to GNN over-squashing.
Abstract
The R-Learner is a powerful, theoretically-grounded framework for estimating heterogeneous treatment effects, prized for its robustness to nuisance model errors. However, its application to network data, where causal heterogeneity is often graph-dependent, presents a critical challenge to its core assumption of a well-specified final-stage model. In this paper, we conduct a large-scale empirical study to systematically dissect the R-Learner framework on graphs. We provide the first rigorous evidence that the primary driver of performance is the inductive bias of the final-stage CATE estimator, an effect that dominates the choice of nuisance models. Our central finding is the quantification of a catastrophic "representation bottleneck": we prove with overwhelming statistical significance (p < 0.001) that R-Learners with a graph-blind final stage fail completely (MSE > 4.0), even when…
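The contrast between a graph-blind and a graph-aware final stage can be illustrated with a toy sketch. This is not the paper's implementation: it assumes synthetic data, a random neighbor list, a CATE that depends on the mean neighbor feature, and plain linear least squares in place of GNNs. All names (`r_learner_cate`, `fit_predict`, `nbrs`, `z`) are hypothetical. The final stage minimizes the standard R-Learner residual-on-residual loss, sum over i of ((Y_i - m(X_i)) - (W_i - e(X_i)) * tau(X_i))^2, with cross-fitted nuisance estimates.

```python
import numpy as np


def fit_predict(A_train, y_train, A_test):
    """Ordinary least squares fit on the training rows, prediction on the test rows."""
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ coef


def r_learner_cate(Y, W, nuisance_feats, cate_feats, folds):
    """R-Learner with cross-fitted linear nuisances and a linear final stage."""
    m_hat = np.empty_like(Y)  # estimate of E[Y | X]
    e_hat = np.empty_like(Y)  # estimate of E[W | X] (linear probability model)
    for f in np.unique(folds):
        tr, te = folds != f, folds == f
        m_hat[te] = fit_predict(nuisance_feats[tr], Y[tr], nuisance_feats[te])
        e_hat[te] = fit_predict(nuisance_feats[tr], W[tr], nuisance_feats[te])
    y_res, w_res = Y - m_hat, W - e_hat
    # With tau(X) = phi(X) @ beta, the R-loss is least squares of y_res on w_res * phi(X).
    beta, *_ = np.linalg.lstsq(w_res[:, None] * cate_feats, y_res, rcond=None)
    return cate_feats @ beta


# Synthetic network data (every modelling choice here is an assumption of the sketch).
rng = np.random.default_rng(0)
n, k = 2000, 4
x = rng.normal(size=n)                     # node feature
nbrs = rng.integers(0, n, size=(n, k))     # k random neighbors per node
z = x[nbrs].mean(axis=1)                   # mean neighbor feature (graph signal)
tau = 1.0 + 2.0 * z                        # true CATE depends on the graph
e = 1.0 / (1.0 + np.exp(-0.5 * x))         # true propensity
W = (rng.random(n) < e).astype(float)      # binary treatment
Y = x + tau * W + rng.normal(size=n)       # outcome

ones = np.ones(n)
graph_feats = np.column_stack([ones, x, z])  # graph-aware representation
blind_feats = np.column_stack([ones, x])     # graph-blind representation
folds = rng.permutation(n) % 2               # two-fold cross-fitting

# Both variants share graph-aware nuisances; only the final stage differs.
tau_blind = r_learner_cate(Y, W, graph_feats, blind_feats, folds)
tau_aware = r_learner_cate(Y, W, graph_feats, graph_feats, folds)

mse_blind = np.mean((tau_blind - tau) ** 2)
mse_aware = np.mean((tau_aware - tau) ** 2)
print(f"graph-blind final stage MSE: {mse_blind:.3f}")
print(f"graph-aware final stage MSE: {mse_aware:.3f}")
```

Because the final stage can only express effects through the features it is given, the graph-blind variant in this toy setup cannot recover the neighbor-driven part of the effect, regardless of how good the nuisance estimates are.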
