Combining Incomplete Observational and Randomized Data for Heterogeneous Treatment Effects
Dong Yao, Caizhi Tang, Qing Cui, Longfei Li

TL;DR
This paper introduces CIO, a novel method that effectively combines incomplete observational data with randomized data to estimate heterogeneous treatment effects, overcoming limitations of existing methods that require complete observational data.
Contribution
The paper proposes a resilient approach called CIO that estimates HTEs using incomplete observational data combined with randomized data, without requiring full observational data coverage.
Findings
CIO accurately estimates HTEs on synthetic datasets.
CIO outperforms existing methods with complete observational data.
The approach is validated on semi-synthetic datasets.
Abstract
Data from observational studies (OSs) are widely available and easy to obtain, yet frequently contain confounding bias. Data from randomized controlled trials (RCTs), by contrast, help reduce this bias but are expensive to gather, so randomized datasets tend to be small. For this reason, effectively fusing observational and randomized data to better estimate heterogeneous treatment effects (HTEs) has gained increasing attention. However, existing methods for integrating observational data with randomized data require complete observational data, meaning that both treated and untreated subjects must be included in the OSs. This prerequisite confines the applicability of such methods to very specific situations, given that including all subjects, whether treated or untreated, in observational studies is not consistently…
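The trade-off the abstract describes, large but confounded observational data versus small but unbiased randomized data, can be illustrated with a toy simulation. The sketch below is not the CIO method and does not come from the paper. The data-generating process, the constant true effect of 1.0, the sample sizes, and the helper names `simulate` and `diff_in_means` are all assumptions chosen for illustration.

```python
import numpy as np

def simulate(n, randomized, rng):
    """Draw (x, t, y) from a hypothetical model with true treatment effect 1.0."""
    x = rng.normal(size=n)                      # covariate that also drives the outcome
    if randomized:
        p = np.full(n, 0.5)                     # RCT: assignment ignores x
    else:
        p = 1.0 / (1.0 + np.exp(-2.0 * x))      # OS: assignment depends on x (confounding)
    t = rng.binomial(1, p)
    y = 1.0 * t + 2.0 * x + rng.normal(size=n)  # outcome: effect 1.0 plus confounder and noise
    return x, t, y

def diff_in_means(t, y):
    """Naive effect estimate: mean treated outcome minus mean untreated outcome."""
    return y[t == 1].mean() - y[t == 0].mean()

rng = np.random.default_rng(0)
_, t_os, y_os = simulate(50_000, randomized=False, rng=rng)   # large observational sample
_, t_rct, y_rct = simulate(500, randomized=True, rng=rng)     # small randomized sample

os_estimate = diff_in_means(t_os, y_os)
rct_estimate = diff_in_means(t_rct, y_rct)
print(f"naive OS estimate:  {os_estimate:.2f}")
print(f"naive RCT estimate: {rct_estimate:.2f}")
```

Under these assumptions, the observational estimate stays far from 1.0 however large the sample, because treated subjects have systematically higher `x`. The randomized estimate is centered on 1.0 but is noisy at this sample size. Note that the naive comparison needs both treated and untreated subjects, so it could not be computed at all from an observational sample missing one of the two groups.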
