RCT Rejection Sampling for Causal Estimation Evaluation
Katherine A. Keith, Sergey Feldman, David Jurgens, Jonathan Bragg, Rohit Bhattacharya

TL;DR
This paper introduces RCT rejection sampling, a new method for creating confounded observational datasets from RCTs, enabling more accurate evaluation of causal inference methods in high-dimensional settings.
Contribution
The paper proposes RCT rejection sampling with theoretical guarantees, demonstrating its effectiveness through synthetic data and a real-world RCT with high-dimensional covariates.
Findings
RCT rejection sampling achieves low bias in causal effect estimation
The method provides valid causal identification guarantees
Finite data considerations are crucial for practical implementation
Abstract
Confounding is a significant obstacle to unbiased estimation of causal effects from observational data. For settings with high-dimensional covariates -- such as text data, genomics, or the behavioral social sciences -- researchers have proposed methods to adjust for confounding by adapting machine learning methods to the goal of causal estimation. However, empirical evaluation of these adjustment methods has been challenging and limited. In this work, we build on a promising empirical evaluation strategy that simplifies evaluation design and uses real data: subsampling randomized controlled trials (RCTs) to create confounded observational datasets while using the average causal effects from the RCTs as ground-truth. We contribute a new sampling algorithm, which we call RCT rejection sampling, and provide theoretical guarantees that causal identification holds in the observational data…
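The subsampling idea in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a textbook rejection-sampling acceptance rule (keep a unit with probability proportional to the ratio of a chosen target P*(T | C) to the RCT's marginal P(T)), a single binary covariate C, and a synthetic outcome model. The names `rct_rejection_sample`, `backdoor_adjusted`, and `simulate_and_estimate` are hypothetical and exist only for this illustration.

```python
import numpy as np


def rct_rejection_sample(t, c, target_p_t1_given_c, rng):
    """Return indices of RCT units kept in the confounded subsample.

    Assumed acceptance rule (an illustration, not confirmed by the source):
    keep unit i when U_i < P*(T_i | C_i) / (P(T_i) * M), with U_i ~ Uniform(0, 1)
    and M the largest ratio P*(T | C) / P(T) among the observed units.
    """
    t = np.asarray(t)
    c = np.asarray(c)
    p_t1 = t.mean()  # marginal treatment probability, estimated from the RCT
    p_t = np.where(t == 1, p_t1, 1.0 - p_t1)
    p_star_t1 = target_p_t1_given_c(c)
    p_star = np.where(t == 1, p_star_t1, 1.0 - p_star_t1)
    ratio = p_star / p_t
    m = ratio.max()
    u = rng.uniform(size=t.shape[0])
    return np.flatnonzero(u < ratio / m)


def diff_in_means(y, t):
    """Unadjusted estimate: mean outcome of treated minus mean of control."""
    return y[t == 1].mean() - y[t == 0].mean()


def backdoor_adjusted(y, t, c):
    """Stratify on the discrete covariate c and average the per-stratum effects."""
    estimate = 0.0
    for value in np.unique(c):
        mask = c == value
        estimate += mask.mean() * diff_in_means(y[mask], t[mask])
    return estimate


def simulate_and_estimate(seed=0, n=20000, true_effect=1.0):
    """Build a synthetic RCT, subsample it, and compare estimators."""
    rng = np.random.default_rng(seed)
    c = rng.integers(0, 2, size=n)  # binary covariate
    t = rng.integers(0, 2, size=n)  # randomized treatment, independent of c
    y = true_effect * t + 2.0 * c + rng.normal(size=n)  # synthetic outcome

    # Hypothetical target: treatment becomes strongly dependent on c.
    def target(cov):
        return np.where(cov == 1, 0.8, 0.2)

    keep = rct_rejection_sample(t, c, target, rng)
    ts, cs, ys = t[keep], c[keep], y[keep]
    return {
        "n_rct": n,
        "n_kept": int(keep.size),
        "rct": diff_in_means(y, t),
        "naive": diff_in_means(ys, ts),
        "adjusted": backdoor_adjusted(ys, ts, cs),
    }


results = simulate_and_estimate(seed=0)
for name, value in results.items():
    print(name, value)
```

In this toy setup the unadjusted difference in means on the subsample should drift away from the RCT estimate because treatment now depends on C, while adjusting for C should bring the estimate back toward it. That comparison against the RCT's average effect is the evaluation strategy the abstract describes. The drop from `n_rct` to `n_kept` also shows why the amount of data left after subsampling matters in practice.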
Taxonomy
Topics: Advanced Causal Inference Techniques · Statistical Methods and Bayesian Inference · Bayesian Modeling and Causal Inference
