Finite-sample bias-variance tradeoff with variables related to trial participation inserted into causal forest models for ensuring generalizability
Rikuta Hamaya, Etsuji Suzuki, Konan Hara

TL;DR
This paper investigates the bias-variance tradeoff in causal forest models for estimating treatment effects from RCTs, highlighting that including trial participation variables can inflate variance and suggesting IPW as a better alternative.
Contribution
It demonstrates that incorporating variables related to trial participation in causal forests can harm precision in finite samples and advocates for addressing selection bias separately with IPW methods.
Findings
Including more than 3 participation-related covariates degrades precision unless sample sizes are large.
Inverse probability weighting (IPW) improves performance across various scenarios.
Application to a real RCT shows IPW refines heterogeneity estimates and shifts effects toward the source population.
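To make the reweighting idea behind these findings concrete, here is a minimal toy sketch of inverse probability of participation weighting. It is not the authors' implementation. The data-generating process, the function names (`simulate`, `ipw_mean`), and the assumption that participation probabilities are known are all hypothetical and chosen only for illustration. In practice the probabilities would be estimated, and the weights would be applied to a CATE estimator rather than to true individual effects, which are never observed.

```python
import math
import random


def simulate(n_source=20000, seed=0):
    """Toy source population (hypothetical DGP, not the paper's)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_source):
        x = rng.gauss(0.0, 1.0)                  # covariate driving participation and effect
        p = 1.0 / (1.0 + math.exp(-(x - 1.0)))   # participation probability, assumed known
        tau = 1.0 + 0.5 * x                      # individual treatment effect (toy)
        in_trial = rng.random() < p              # selection into the trial
        rows.append((x, p, tau, in_trial))
    return rows


def ipw_mean(rows):
    """Normalized (Hajek-style) weighted mean over trial participants."""
    trial = [(p, tau) for _, p, tau, selected in rows if selected]
    weights = [1.0 / p for p, _ in trial]        # inverse probability of participation
    total = sum(w * tau for w, (_, tau) in zip(weights, trial))
    return total / sum(weights)


rows = simulate()
source_mean = sum(tau for _, _, tau, _ in rows) / len(rows)
trial_taus = [tau for _, _, tau, selected in rows if selected]
naive = sum(trial_taus) / len(trial_taus)        # unweighted trial-sample mean
weighted = ipw_mean(rows)

print(f"source population mean effect: {source_mean:.3f}")
print(f"unweighted trial mean effect:  {naive:.3f}")
print(f"IPW-weighted trial mean:       {weighted:.3f}")
```

In this toy setup, participants with larger `x` are over-represented in the trial, so the unweighted trial mean overstates the source-population effect, while the weighted mean moves back toward it. The sketch handles selection through weights alone, without adding participation-related covariates to the effect model, which is the separation the summary describes.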
Abstract
Estimating conditional average treatment effects (CATE) from randomized controlled trials (RCTs) and generalizing them to broader populations is essential for personalizing treatment rules, but it is complicated by selection bias due to trial participation and by potentially high-dimensional covariates. We evaluated the finite-sample bias-variance tradeoff of Causal Forest-based CATE estimation strategies for addressing this selection bias. Identification theory suggests that unbiased CATE estimation is possible when covariates related to trial participation are included in CATE-estimating models. However, simulation studies demonstrated that, under realistic RCT sample sizes, variance inflation from high-dimensional covariates often outweighed the modest bias reduction. In our data-generating process, which defines the individual treatment effect (ITE) in the source population and selected trial samples, including more…
