Improving Bias Correction Standards by Quantifying its Effects on Treatment Outcomes
Alexandre Abraham, Andrés Hoyos Idrobo

TL;DR
This paper introduces A2A, a novel metric for evaluating bias correction in propensity score matching, significantly improving the accuracy and reproducibility of treatment effect estimates in health data analysis.
Contribution
The authors develop A2A, a new metric that assesses how accurately a matching corrects outcomes, and automate the propensity score matching (PSM) pipeline, integrating it into the popmatch Python package.
Findings
A2A reduces ATE estimation errors by up to 50% on synthetic data.
A2A decreases predicted ATE variability by up to 90%.
The automated pipeline improves the reproducibility and accessibility of bias correction methods.
Abstract
With the growing access to administrative health databases, retrospective studies have become crucial evidence for medical treatments. Yet, non-randomized studies frequently face selection biases, requiring mitigation strategies. Propensity score matching (PSM) addresses these biases by selecting comparable populations, allowing for analysis without further methodological constraints. However, PSM has several drawbacks. Different matching methods can produce significantly different Average Treatment Effects (ATE) for the same task, even when meeting all validation criteria. To prevent cherry-picking the best method, public authorities must involve field experts and engage in extensive discussions with researchers. To address this issue, we introduce a novel metric, A2A, to reduce the number of valid matches. A2A constructs artificial matching tasks that mirror the original ones but…
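To make the matching step described in the abstract concrete, here is a minimal sketch of propensity score matching on synthetic data. It is not the authors' A2A metric and does not use the popmatch API. Every name in it (simulate, naive_difference, matched_difference, the caliper value) is hypothetical, and it assumes the true propensity score is known, whereas a real study would estimate it with a model such as logistic regression. Because the simulated treatment effect is constant, the effect on the treated that matching estimates coincides with the ATE.

```python
import math
import random


def simulate(n=2000, true_effect=2.0, seed=0):
    """Synthetic observational data with a single confounder x.

    Units with larger x are more likely to be treated, and x also raises
    the outcome, so a naive comparison of group means is biased.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # Assumption: the true propensity score is known (logistic in x).
        propensity = 1.0 / (1.0 + math.exp(-x))
        treated = rng.random() < propensity
        outcome = 3.0 * x + (true_effect if treated else 0.0) + rng.gauss(0.0, 1.0)
        rows.append((propensity, treated, outcome))
    return rows


def mean(values):
    return sum(values) / len(values)


def naive_difference(rows):
    """Difference in mean outcome between treated and control, no correction."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return mean(treated) - mean(control)


def matched_difference(rows, caliper=0.05):
    """1-nearest-neighbour matching on the propensity score, with replacement.

    Each treated unit is paired with the control whose propensity score is
    closest; pairs further apart than `caliper` are dropped.
    Returns (mean outcome difference over matched pairs, number of pairs).
    """
    treated = [(p, y) for p, t, y in rows if t]
    control = [(p, y) for p, t, y in rows if not t]
    diffs = []
    for p_t, y_t in treated:
        p_c, y_c = min(control, key=lambda c: abs(c[0] - p_t))
        if abs(p_c - p_t) <= caliper:
            diffs.append(y_t - y_c)
    if not diffs:
        raise ValueError("no treated unit has a control within the caliper")
    return mean(diffs), len(diffs)


if __name__ == "__main__":
    rows = simulate()
    estimate, n_matched = matched_difference(rows)
    print(f"naive difference:   {naive_difference(rows):.2f}")
    print(f"matched difference: {estimate:.2f} ({n_matched} matched pairs)")
```

Changing the caliper, or matching without replacement, yields a different estimate from the same data. That sensitivity to the choice of matching method is the problem the abstract describes.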
