Adjusting for indirectly measured confounding using large-scale propensity scores
Linying Zhang, Yixin Wang, Martijn Schuemie, David Blei, George Hripcsak

TL;DR
This paper explores how large-scale propensity scores can effectively adjust for both directly and indirectly measured confounders in observational medical data, improving causal inference accuracy.
Contribution
It introduces conditions under which large-scale propensity scores remove bias from indirectly measured confounders and demonstrates their effectiveness with simulated and real medical data.
Findings
Large-scale propensity scores (LSPS) can adjust for indirectly measured confounders when the covariate set is large enough to include variables correlated with them.
LSPS may avoid bias by not adjusting for colliders.
Empirical results show improved bias reduction in medical datasets.
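A toy simulation can make the first finding concrete. The sketch below is not the paper's pipeline: this summary does not describe how the authors fit their propensity model, so the L2 penalty, the gradient-descent fit, the decile stratification, the 100 proxy covariates (rather than tens of thousands), and every function name are illustrative assumptions. It generates a confounder `u` that is never observed directly, exposes only noisy covariates correlated with it, and compares a naive treated-versus-control difference with an estimate stratified on a propensity score fitted to those covariates.

```python
import numpy as np


def simulate(n=5000, p=100, tau=1.0, seed=0):
    """Toy data: confounder u is never observed, only p noisy proxies of it."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                                # indirectly measured confounder
    loadings = rng.normal(size=p)                         # how strongly each proxy tracks u
    x = np.outer(u, loadings) + rng.normal(size=(n, p))   # measured covariates
    t = rng.binomial(1, 1.0 / (1.0 + np.exp(-u)))         # treatment depends on u
    y = tau * t + 2.0 * u + rng.normal(size=n)            # outcome depends on t and u
    return x, t, y


def fit_propensity(x, t, l2=1.0, steps=300):
    """L2-penalized logistic regression fitted by plain gradient descent (assumed setup)."""
    n, p = x.shape
    xa = np.hstack([np.ones((n, 1)), x - x.mean(axis=0)])  # intercept + centered covariates
    penalty = np.r_[0.0, np.full(p, l2)] / n               # intercept is not penalized
    # Step size 1/L, where L is an upper bound on the curvature of the penalized loss.
    lr = 1.0 / (0.25 * np.linalg.norm(xa, 2) ** 2 / n + l2 / n)
    w = np.zeros(p + 1)
    for _ in range(steps):
        ps = 1.0 / (1.0 + np.exp(-(xa @ w)))
        w -= lr * (xa.T @ (ps - t) / n + penalty * w)
    return 1.0 / (1.0 + np.exp(-(xa @ w)))


def stratified_effect(ps, t, y, n_strata=10):
    """Size-weighted average of within-stratum treated-minus-control mean differences."""
    edges = np.quantile(ps, np.linspace(0, 1, n_strata + 1))
    strata = np.clip(np.searchsorted(edges, ps, side="right") - 1, 0, n_strata - 1)
    total, weight = 0.0, 0
    for s in range(n_strata):
        in_s = strata == s
        treated, control = in_s & (t == 1), in_s & (t == 0)
        if treated.any() and control.any():  # skip strata lacking either group
            total += in_s.sum() * (y[treated].mean() - y[control].mean())
            weight += in_s.sum()
    return total / weight


if __name__ == "__main__":
    x, t, y = simulate()
    naive = y[t == 1].mean() - y[t == 0].mean()
    adjusted = stratified_effect(fit_propensity(x, t), t, y)
    print(f"true effect 1.0, naive {naive:.2f}, PS-stratified {adjusted:.2f}")
```

In this setup the propensity model never sees `u`, yet the covariates jointly carry enough information about it that stratifying on the fitted score should remove most of the confounding bias. Shrinking `p` or weakening the loadings would be a simple way to see the adjustment degrade.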
Abstract
Confounding remains one of the major challenges to causal inference with observational data. This problem is paramount in medicine, where we would like to answer causal questions from large observational datasets like electronic health records (EHRs) and administrative claims. Modern medical data typically contain tens of thousands of covariates. Such a large set carries hope that many of the confounders are directly measured, and further hope that others are indirectly measured through their correlation with measured covariates. How can we exploit these large sets of covariates for causal inference? To help answer this question, this paper examines the performance of the large-scale propensity score (LSPS) approach on causal analysis of medical data. We demonstrate that LSPS may adjust for indirectly measured confounders by including tens of thousands of covariates that may be…
Taxonomy
Topics: Advanced Causal Inference Techniques · Health Systems, Economic Evaluations, Quality of Life · Statistical Methods and Inference
