
TL;DR
This paper decomposes ML predictions into components relevant for causal analysis and proposes a new metric to improve model selection and treatment effect estimation from panel data.
Contribution
It presents a novel decomposition of predictions into three components, highlighting the importance of within-unit-across-time accuracy for causal inference, and proposes a diagnostic metric for model selection.
Findings
The counterfactual-treatment-effect component determines whether a model recovers the true treatment effect.
Within-unit-across-time prediction accuracy tracks recovery of the causal effect more closely than overall prediction accuracy does.
The proposed metric enables model diagnostics and unbiased treatment effect estimation under certain assumptions.
Abstract
There is rising interest in using Machine Learning (ML) model predictions as outcomes in causal analysis. However, these methods have faced challenges in recovering the true treatment effects. It is also difficult to decide which prediction model to use, since we are interested not only in the accuracy of the prediction but also in its ability to produce the correct causal effect in the analysis. In this paper I propose a decomposition of the prediction into between-unit prediction, within-unit-across-time prediction, and counterfactual-treatment-effect prediction. I show that the counterfactual-treatment-effect component is the one that determines whether the model recovers the true treatment effect, but only the first two components can be estimated from non-experimental data. I argue that within-unit-across-time prediction accuracy…
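To make the first two components concrete, the sketch below splits prediction accuracy on panel data into a between-unit part (accuracy on unit-level means) and a within-unit-across-time part (accuracy on deviations from each unit's own mean). It is a generic between/within R² split, not the paper's metric, whose exact definition is not given in this summary. The function names `r2` and `between_within_r2` and the toy data are hypothetical. The counterfactual-treatment-effect component is omitted because, as the abstract states, it cannot be estimated from non-experimental data.

```python
import numpy as np

def r2(y, yhat):
    """Plain coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def between_within_r2(unit_ids, y, yhat):
    """Hypothetical helper: overall, between-unit, and within-unit R^2.

    This is an illustrative decomposition under the assumption that
    'between' means accuracy on unit means and 'within' means accuracy
    on unit-demeaned values; the paper may define its metric differently.
    """
    unit_ids = np.asarray(unit_ids)
    y = np.asarray(y, dtype=float)
    yhat = np.asarray(yhat, dtype=float)

    # Map each observation to an integer unit index.
    _, inv = np.unique(unit_ids, return_inverse=True)
    counts = np.bincount(inv)

    # Per-unit means of the observed outcome and of the prediction.
    y_mean = np.bincount(inv, weights=y) / counts
    yhat_mean = np.bincount(inv, weights=yhat) / counts

    return {
        "overall": r2(y, yhat),
        "between": r2(y_mean, yhat_mean),
        "within": r2(y - y_mean[inv], yhat - yhat_mean[inv]),
    }

# Toy panel: 3 units observed for 4 periods each.
unit_ids = np.repeat([0, 1, 2], 4)
levels = np.repeat([0.0, 10.0, 20.0], 4)       # unit-specific levels
deviations = np.tile([-1.0, 1.0, -1.0, 1.0], 3)  # within-unit movement over time
y = levels + deviations

# A model that predicts each unit's level but none of its movement over time.
yhat = levels.copy()

res = between_within_r2(unit_ids, y, yhat)
print(res)
```

In this toy case the overall and between-unit scores are high while the within-unit score is zero, which illustrates why a model can look accurate overall yet carry no information about changes within a unit over time.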
