Strategy to select most efficient RCT samples based on observational data
Wenqi Shi, Xi Lin

TL;DR
This paper proposes a novel design strategy for selecting RCT samples based on observational data to maximize estimation efficiency and reduce variance in causal effect estimation.
Contribution
It derives the optimal covariate allocation of RCT samples at the experimental design stage, improving the efficiency of the combined causal estimator over approaches that only refine the analysis after the RCT.
Findings
The derived optimal covariate allocation gives the lowest variance of the combined causal estimator.
The optimal allocation does not necessarily follow the covariate distribution of the target population.
Practical strategies accommodate cost and precision constraints.
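The first two findings can be illustrated with a simplified stratified example. This is a minimal sketch, not the paper's derivation: it assumes a single discrete covariate with known population shares (as might be estimated from observational data), stratum effect estimates whose variance is sigma_k^2 / n_k, and a fixed total RCT size. Under those assumptions, the variance-minimizing allocation is the classical Neyman allocation, proportional to population share times sigma_k. The function names and all numbers are hypothetical.

```python
def stratified_variance(pop_share, sigma, rct_share, n):
    """Variance of sum_k p_k * tau_hat_k when Var(tau_hat_k) = sigma_k^2 / (n * r_k).

    pop_share: target-population share p_k of each stratum (assumed known)
    sigma:     per-unit noise scale sigma_k of each stratum (assumed known)
    rct_share: fraction r_k of the n RCT units recruited from each stratum
    """
    return sum(p**2 * s**2 / (n * r) for p, s, r in zip(pop_share, sigma, rct_share))


def neyman_allocation(pop_share, sigma):
    """Allocation r_k proportional to p_k * sigma_k (minimizes the variance above)."""
    weights = [p * s for p, s in zip(pop_share, sigma)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical inputs: three strata with increasingly noisy outcomes.
pop_share = [0.5, 0.3, 0.2]
sigma = [1.0, 2.0, 4.0]
n = 1000

optimal = neyman_allocation(pop_share, sigma)

# Recruiting in proportion to the target population vs. the optimal allocation.
var_proportional = stratified_variance(pop_share, sigma, pop_share, n)
var_optimal = stratified_variance(pop_share, sigma, optimal, n)

print("optimal RCT shares:", [round(r, 3) for r in optimal])
print("variance, proportional allocation:", var_proportional)
print("variance, optimal allocation:", var_optimal)
```

In this toy setting the optimal design oversamples the noisier strata relative to their population share, which is why it departs from the target distribution. The paper's setting is more general, so the sketch should be read only as intuition for the direction of the result.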
Abstract
Randomized experiments can provide unbiased estimates of sample average treatment effects. However, estimates of population treatment effects can be biased when the experimental sample and the target population differ. In this case, the population average treatment effect can be identified by combining experimental and observational data. A good experiment design trumps all the analyses that come after. While most of the existing literature centers around improving analyses after RCTs, we instead focus on the design stage, fundamentally improving the efficiency of the combined causal estimator through the selection of experimental samples. We explore how the covariate distribution of RCT samples influences the estimation efficiency and derive the optimal covariate allocation that leads to the lowest variance. Our results show that the optimal allocation does not necessarily follow the…
Taxonomy
Topics: Advanced Causal Inference Techniques · Statistical Methods and Bayesian Inference · Economic and Environmental Valuation
