Data Integration for Estimating Subgroup-Specific Conditional Average Treatment Effects (CATEs) Using Coarsened External Information in Randomized Trials
Youqi Yang, Walter Dempsey, and Bhramar Mukherjee

TL;DR
This paper introduces a novel James-Stein-type estimator that leverages coarsened external data to improve subgroup-specific treatment effect estimates in RCTs, especially when internal data are sparse.
Contribution
The paper develops a shrinkage estimator that combines internal estimates with coarser external estimates to improve CATE estimation in finer subgroups, while allowing marginal CATEs to differ between the internal and external populations.
Findings
The estimator outperforms traditional methods in simulations.
It detects subgroup effects not visible with internal data alone.
Application to weight-loss trials reveals significant subgroup differences.
Abstract
Randomized controlled trials (RCTs) are often underpowered to detect treatment heterogeneity in subgroups defined by cross-classifications of multiple covariates, due to sparse sample sizes in some strata. External RCT data can help, but typically provide treatment effect estimates at a coarser level (e.g., by sex or race) rather than for the finer subgroups of interest (e.g., race-by-sex). We propose a novel James-Stein (JS)-type estimator that borrows strength from such coarsened external estimates to improve estimation of finer subgroup-specific conditional average treatment effects (CATEs) in an internal study, while accommodating potential incompatibility in marginal CATEs across populations. Based on asymptotic theory, we derive a practical analytic variance estimator for the JS estimator that exhibits acceptable empirical performance. Under mild conditions, we show that the…
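To make the idea of borrowing strength from coarser estimates concrete, here is a minimal sketch of a generic textbook positive-part James-Stein shrinkage toward a fixed target. It is not the estimator proposed in the paper, whose exact form and incompatibility adjustment are not given in this summary. The function name `js_shrink_toward_target`, the treatment of variances as known, the treatment of the external estimates as fixed, and all numbers are assumptions made for illustration only.

```python
import numpy as np


def js_shrink_toward_target(internal_est, internal_var, target):
    """Generic positive-part James-Stein-type shrinkage (illustrative only).

    internal_est : fine-subgroup CATE estimates from the internal trial
    internal_var : their sampling variances (assumed known here)
    target       : one value per fine subgroup, built from coarser external
                   estimates (treated as fixed in this sketch)
    Returns the shrunken estimates and the weight kept on the internal data.
    """
    internal_est = np.asarray(internal_est, dtype=float)
    internal_var = np.asarray(internal_var, dtype=float)
    target = np.asarray(target, dtype=float)

    k = internal_est.size
    if k < 3:
        raise ValueError("this form of James-Stein shrinkage needs at least 3 subgroups")

    diff = internal_est - target
    # Variance-standardized squared distance between internal estimates and target.
    dist = np.sum(diff ** 2 / internal_var)
    if dist == 0.0:
        return target.copy(), 0.0

    # Positive-part weight: 0 means full shrinkage to the target,
    # values near 1 mean the internal estimates are left almost unchanged.
    weight = max(0.0, 1.0 - (k - 2) / dist)
    return target + weight * diff, weight


# Hypothetical example: four race-by-sex cells, external estimates by sex only.
external_by_sex = {"F": -1.5, "M": -2.0}   # made-up coarse external CATEs
cell_sex = ["F", "M", "F", "M"]            # sex of each race-by-sex cell
target = [external_by_sex[s] for s in cell_sex]

internal_est = [-2.0, -0.5, -3.0, -1.0]    # made-up internal cell estimates
internal_var = [1.0, 1.0, 1.0, 1.0]        # made-up sampling variances

shrunk, weight = js_shrink_toward_target(internal_est, internal_var, target)
print("weight on internal estimates:", weight)
print("shrunken cell estimates:", shrunk)
```

In this sketch, the further the internal estimates sit from the external target relative to their variances, the more weight stays on the internal data. That is the general mechanism by which shrinkage estimators guard against borrowing from an incompatible source.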
