Empirical Analysis of Model Selection for Heterogeneous Causal Effect Estimation
Divyat Mahajan, Ioannis Mitliagkas, Brady Neal, Vasilis Syrgkanis

TL;DR
This paper empirically benchmarks various surrogate metrics for model selection in causal inference, specifically for CATE estimation, highlighting the importance of hyperparameter tuning and causal ensembling for improved performance.
Contribution
It provides a comprehensive empirical comparison of surrogate model selection metrics for CATE estimation and introduces new strategies based on hyperparameter tuning and ensembling.
Findings
Surrogate metrics vary in effectiveness across datasets.
Hyperparameter tuning significantly impacts model selection.
Causal ensembling improves CATE estimation accuracy.
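To make the ensembling finding concrete, here is a minimal sketch of one way to combine several CATE estimators using their surrogate-metric scores. This summary does not say how the ensemble is formed, so the softmax weighting, the `temperature` parameter, and the function names `softmax_weights` and `ensemble_cate` are illustrative assumptions, not necessarily the authors' procedure. The scores and predictions are made-up numbers.

```python
import numpy as np

def softmax_weights(scores, temperature=1.0):
    """Turn surrogate-metric scores (lower = better) into ensemble weights.

    Assumption: a softmax over negated scores; `temperature` controls how
    sharply the weights concentrate on the best-scoring estimator.
    """
    s = np.asarray(scores, dtype=float)
    logits = -(s - s.min()) / temperature  # shift by the min for numerical stability
    w = np.exp(logits)
    return w / w.sum()

def ensemble_cate(predictions, scores, temperature=1.0):
    """Weighted average of per-estimator CATE predictions.

    predictions: array of shape (n_estimators, n_units)
    scores:      one surrogate-metric value per estimator
    """
    w = softmax_weights(scores, temperature)
    return w @ np.asarray(predictions, dtype=float)

# Hypothetical CATE predictions from three estimators on three units.
preds = np.array([
    [1.0, 2.0, 3.0],
    [1.5, 2.5, 3.5],
    [-1.0, -2.0, -3.0],
])
scores = [0.2, 0.3, 5.0]  # hypothetical surrogate-metric values

weights = softmax_weights(scores, temperature=0.5)
tau_ens = ensemble_cate(preds, scores, temperature=0.5)
```

As the temperature goes to zero, the weights collapse onto the single best-scoring estimator, so picking one model is a limiting case of this kind of ensemble.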
Abstract
We study the problem of model selection in causal inference, specifically for conditional average treatment effect (CATE) estimation. Unlike machine learning, there is no perfect analogue of cross-validation for model selection as we do not observe the counterfactual potential outcomes. Towards this, a variety of surrogate metrics have been proposed for CATE model selection that use only observed data. However, we do not have a good understanding regarding their effectiveness due to limited comparisons in prior studies. We conduct an extensive empirical analysis to benchmark the surrogate model selection metrics introduced in the literature, as well as the novel ones introduced in this work. We ensure a fair comparison by tuning the hyperparameters associated with these metrics via AutoML, and provide more detailed trends by incorporating realistic datasets via generative modeling. Our…
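As an illustration of what a surrogate metric computed from observed data alone can look like, the sketch below scores candidate CATE estimators against a doubly robust pseudo-outcome. The abstract does not name this metric, so treat it as one example of the general idea rather than the paper's benchmark. The simulated data, the known propensity of 0.5, the per-arm linear outcome models, and all function names are assumptions made for the example.

```python
import numpy as np

def dr_pseudo_outcome(y, t, mu0, mu1, e):
    """Doubly robust pseudo-outcome; its conditional mean equals the CATE
    when the propensity e or the outcome models mu0, mu1 are correct."""
    return mu1 - mu0 + t * (y - mu1) / e - (1 - t) * (y - mu0) / (1 - e)

def dr_score(tau_hat, y, t, mu0, mu1, e):
    """Surrogate metric: mean squared gap between predicted CATE and pseudo-outcome."""
    return float(np.mean((tau_hat - dr_pseudo_outcome(y, t, mu0, mu1, e)) ** 2))

def fit_linear(xs, ys):
    """Least-squares fit of ys ~ a + b * xs; returns (a, b)."""
    design = np.column_stack([np.ones_like(xs), xs])
    coef, *_ = np.linalg.lstsq(design, ys, rcond=None)
    return coef

# Simulated data (assumption): randomized treatment, true CATE = 1 + 2x.
rng = np.random.default_rng(0)
n = 4000
x = rng.uniform(-1, 1, size=n)
t = rng.binomial(1, 0.5, size=n)
y = x + t * (1.0 + 2.0 * x) + rng.normal(scale=0.5, size=n)

# Fit nuisance outcome models on one half, score candidates on the other half.
train, val = np.arange(n) < n // 2, np.arange(n) >= n // 2
c0 = fit_linear(x[train & (t == 0)], y[train & (t == 0)])
c1 = fit_linear(x[train & (t == 1)], y[train & (t == 1)])
xv, tv, yv = x[val], t[val], y[val]
mu0, mu1 = c0[0] + c0[1] * xv, c1[0] + c1[1] * xv
e = np.full(xv.shape, 0.5)  # known propensity under randomization

# Hypothetical candidate CATE estimators, given here as fixed functions of x.
candidates = {
    "constant_effect": np.full(xv.shape, 1.0),
    "linear_effect": 1.0 + 2.0 * xv,
    "wrong_sign": -(1.0 + 2.0 * xv),
}
scores = {name: dr_score(tau, yv, tv, mu0, mu1, e) for name, tau in candidates.items()}
best = min(scores, key=scores.get)
```

The score never touches the true CATE: it uses only observed outcomes, treatments, and fitted nuisance models, which is what makes it usable for model selection when counterfactuals are unobserved.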
Taxonomy
Topics: Advanced Causal Inference Techniques · Bayesian Modeling and Causal Inference · Statistical Methods and Inference
