Improving the Finite Sample Estimation of Average Treatment Effects using Double/Debiased Machine Learning with Propensity Score Calibration
Daniele Ballinari, Nora Bearth

TL;DR
This paper integrates propensity score calibration into double/debiased machine learning (DML) to improve the estimation of average treatment effects in finite samples. In simulations, calibration reduces the root mean squared error of the estimates.
Contribution
It introduces the use of probability calibration methods within the DML framework to improve finite sample estimation of causal effects.
Findings
Calibration significantly reduces root mean squared error in simulations.
Using calibrated propensity scores does not change the asymptotic properties of the DML estimator.
An empirical example demonstrates the practical benefits of calibration.
Abstract
In the last decade, machine learning techniques have gained popularity for estimating causal effects. One machine learning approach that can be used for estimating an average treatment effect is Double/debiased machine learning (DML) (Chernozhukov et al., 2018). This approach uses a double-robust score function that relies on the prediction of nuisance functions, such as the propensity score, which is the probability of treatment assignment conditional on covariates. Estimators relying on double-robust score functions are highly sensitive to errors in propensity score predictions. Machine learners increase the severity of this problem as they tend to over- or underestimate these probabilities. Several calibration approaches have been proposed to improve probabilistic forecasts of machine learners. This paper investigates the use of probability calibration approaches within the DML…
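To make the sensitivity to propensity score errors concrete, here is a minimal sketch of a doubly robust (AIPW) score for the average treatment effect, with the propensity scores calibrated first. It rests on several assumptions that are not taken from the paper: isotonic regression serves as one common example of a calibration method (the abstract excerpt does not say which methods the authors study), the calibration map is fitted in-sample, cross-fitting is left out, and the nuisance predictions are passed in as given. The names `pava`, `calibrate_isotonic`, `aipw_ate`, and the clipping threshold `eps` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def pava(values):
    """Pool-adjacent-violators: non-decreasing least-squares fit to `values`."""
    blocks = []  # each block holds [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block mean exceeds the current one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fit = []
    for s, n in blocks:
        fit.extend([s / n] * n)
    return fit


def calibrate_isotonic(scores, d, eps=0.01):
    """Isotonic calibration of raw propensity scores against treatment d.

    Assumption: fitted and applied on the same observations for brevity;
    tied scores are not pooled. Results are clipped to [eps, 1 - eps] so
    the inverse weights in the score stay finite.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    fitted = pava([d[i] for i in order])
    calibrated = [0.0] * len(scores)
    for rank, i in enumerate(order):
        calibrated[i] = min(max(fitted[rank], eps), 1.0 - eps)
    return calibrated


def aipw_ate(y, d, mu0, mu1, e):
    """ATE estimate and standard error from the doubly robust score.

    psi_i = mu1_i - mu0_i + d_i (y_i - mu1_i) / e_i
            - (1 - d_i) (y_i - mu0_i) / (1 - e_i)
    """
    psi = [
        m1 - m0 + di * (yi - m1) / ei - (1 - di) * (yi - m0) / (1 - ei)
        for yi, di, m0, m1, ei in zip(y, d, mu0, mu1, e)
    ]
    return mean(psi), stdev(psi) / sqrt(len(psi))


# Toy usage with made-up numbers (not data from the paper).
y = [2.0, 1.0, 3.0, 0.5]
d = [1, 0, 1, 0]
raw_scores = [0.7, 0.2, 0.9, 0.4]   # raw learner output, assumed given
mu0 = [1.0, 1.0, 1.0, 1.0]          # predicted outcome under control
mu1 = [2.5, 2.5, 2.5, 2.5]          # predicted outcome under treatment
e_cal = calibrate_isotonic(raw_scores, d)
ate, se = aipw_ate(y, d, mu0, mu1, e_cal)
```

The residual terms in the score are divided by the propensity score or its complement, so a predicted probability that is too close to 0 or 1 inflates the contribution of individual observations. Calibration targets exactly that input while leaving the rest of the score unchanged.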
Taxonomy
Topics: Machine Learning and Data Classification
