Propensity score models are better when post-calibrated
Rom Gutman, Ehud Karavani, Yishai Shimoni

TL;DR
This paper demonstrates that post-calibrating propensity scores improves causal effect estimates, especially for flexible models that produce uncalibrated scores, and that the improvement is not explained by better covariate balancing.
Contribution
It provides empirical evidence that simple post-calibration enhances the accuracy of propensity score-based causal inference for expressive models.
Findings
Post-calibration reduces error in effect estimation.
Improvement is larger when initial scores are poorly calibrated.
Post-calibration is computationally inexpensive and effective.
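To make "poorly calibrated" concrete, the sketch below computes a binned calibration error: the average gap between the mean predicted score and the observed treatment rate in each score bin. This is one common way to quantify calibration and is an assumption here; the summary does not say which calibration measure the paper uses. The function name and the choice of 10 equal-width bins are illustrative.

```python
def expected_calibration_error(scores, treated, n_bins=10):
    """Weighted mean gap between predicted score and observed treatment rate.

    scores  : propensity scores in [0, 1]
    treated : 0/1 treatment indicators, same length as scores
    n_bins  : number of equal-width bins (illustrative default)
    """
    bins = [[] for _ in range(n_bins)]
    for s, t in zip(scores, treated):
        # A score of exactly 1.0 is placed in the last bin.
        idx = min(int(s * n_bins), n_bins - 1)
        bins[idx].append((s, t))

    n = len(scores)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_score = sum(s for s, _ in members) / len(members)
        treat_rate = sum(t for _, t in members) / len(members)
        # Weight each bin's gap by the share of samples it holds.
        ece += len(members) / n * abs(mean_score - treat_rate)
    return ece
```

A value near zero means the scores behave like treatment probabilities within each bin; larger values indicate over- or under-confident scores.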
Abstract
Theoretical guarantees for causal inference using propensity scores are partly based on the scores behaving like conditional probabilities. However, scores between zero and one, especially when outputted by flexible statistical estimators, do not necessarily behave like probabilities. We perform a simulation study to assess the error in estimating the average treatment effect before and after applying a simple and well-established post-processing method to calibrate the propensity scores. We find that post-calibration reduces the error in effect estimation for expressive uncalibrated statistical estimators, and that this improvement is not mediated by better balancing. The larger the initial lack of calibration, the larger the improvement in effect estimation, with the effect on already-calibrated estimators being very small. Given the improvement in effect estimation and that…
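The workflow described in the abstract can be sketched as follows: take raw scores from any propensity model, post-calibrate them, and compare the effect estimate before and after. The abstract does not name the calibration method, so this sketch assumes Platt-style sigmoid scaling (a logistic fit on the logit of the raw score) and a normalized inverse-propensity-weighting estimator. All function names, the step-size rule, and the toy data are illustrative assumptions, not the authors' implementation.

```python
import math

def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def _logit(p, eps=1e-6):
    # Clip to avoid infinite logits at exactly 0 or 1.
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def fit_platt(raw_scores, treated, n_iter=2000):
    """Fit (a, b) so that sigmoid(a * logit(raw) + b) tracks treatment.

    Plain gradient descent on the mean log-loss, starting from the
    identity map (a=1, b=0). The step size is 1 / L, where L is an
    upper bound on the curvature of the loss.
    """
    x = [_logit(s) for s in raw_scores]
    n = len(x)
    lipschitz = 0.25 * (sum(xi * xi for xi in x) / n + 1.0)
    lr = 1.0 / lipschitz
    a, b = 1.0, 0.0
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for xi, ti in zip(x, treated):
            err = _sigmoid(a * xi + b) - ti
            grad_a += err * xi
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def calibrate(raw_scores, a, b):
    """Apply the fitted sigmoid map to raw scores."""
    return [_sigmoid(a * _logit(s) + b) for s in raw_scores]

def ipw_ate(outcomes, treated, scores):
    """Normalized (Hajek) IPW estimate of the average treatment effect.

    Scores must lie strictly between 0 and 1.
    """
    w1 = [t / e for t, e in zip(treated, scores)]
    w0 = [(1 - t) / (1 - e) for t, e in zip(treated, scores)]
    mu1 = sum(w * y for w, y in zip(w1, outcomes)) / sum(w1)
    mu0 = sum(w * y for w, y in zip(w0, outcomes)) / sum(w0)
    return mu1 - mu0

# Toy data (hypothetical): two groups of 10 units. The raw model is
# overconfident (0.01 and 0.99), while the treatment rates in the two
# groups are 0.3 and 0.7. Outcome = treatment + 2 for the second group,
# so the effect built into the toy data is 1.
raw = [0.01] * 10 + [0.99] * 10
t = [1] * 3 + [0] * 7 + [1] * 7 + [0] * 3
y = [ti + (2 if i >= 10 else 0) for i, ti in enumerate(t)]

a, b = fit_platt(raw, t)
calibrated = calibrate(raw, a, b)
ate_raw = ipw_ate(y, t, raw)
ate_calibrated = ipw_ate(y, t, calibrated)
```

In this toy setup the calibrated scores move toward the within-group treatment rates, and `ate_calibrated` lands much closer to the built-in effect of 1 than `ate_raw`. In practice the calibration map would usually be fit on held-out data or with cross-fitting; fitting and applying it on the same sample is a simplification made here for brevity.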
Taxonomy
Topics: Advanced Causal Inference Techniques · Bayesian Modeling and Causal Inference · Economic Policies and Impacts
