Generalizing Trimming Bounds for Endogenously Missing Outcome Data Using Random Forests
Cyrus Samii, Ye Wang, Junlong Aaron Zhou

TL;DR
This paper introduces a method using generalized random forests to tighten non-parametric bounds on treatment effects in studies with outcome data missing due to endogenous selection, improving inference without strong assumptions.
Contribution
It develops a novel approach for narrowing bounds on treatment effects with many covariates, leveraging random forests for honest, assumption-agnostic inference.
Findings
The method yields narrower bounds in simulations and replications.
It effectively adjusts for numerous covariates.
It demonstrates practical benefits over traditional bounding methods.
Abstract
In many experimental or quasi-experimental studies, outcomes of interest are only observed for subjects who select (or are selected) to engage in the activity generating the outcome. Outcome data is thus endogenously missing for units who do not engage, in which case random or conditionally random treatment assignment prior to such choices is insufficient to point identify treatment effects. Non-parametric partial identification bounds are a way to address endogenous missingness without having to make disputable parametric assumptions. Basic bounding approaches often yield bounds that are very wide and therefore minimally informative. We present methods for narrowing non-parametric bounds on treatment effects by adjusting for potentially large numbers of covariates, working with generalized random forests. Our approach allows for agnosticism about the data-generating process and honest…
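To make the "basic bounding approaches" mentioned in the abstract concrete, here is a minimal sketch of unadjusted trimming bounds in the style of Lee (2009). It is not the authors' forest-based estimator. It assumes random assignment and monotonicity (treatment can only increase the chance that an outcome is observed). The function name `trimming_bounds` and its arguments are hypothetical, chosen only for illustration.

```python
import numpy as np

def trimming_bounds(y, treated, observed):
    """Basic (covariate-free) trimming bounds on the treatment effect
    for units whose outcome would be observed under either arm.

    Assumptions (not taken from the paper's text): random assignment and
    monotonicity, i.e. treatment never makes a unit less likely to be observed.
    """
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    observed = np.asarray(observed, dtype=bool)

    p1 = observed[treated].mean()    # observation rate, treated arm
    p0 = observed[~treated].mean()   # observation rate, control arm
    if p1 < p0:
        raise ValueError("sketch assumes treatment weakly increases observation")

    # Share of observed treated units that would also be observed under control.
    q = p0 / p1

    y1 = np.sort(y[treated & observed])       # observed treated outcomes, ascending
    y0_mean = y[~treated & observed].mean()   # observed control mean

    k = int(np.floor(q * len(y1) + 1e-9))     # number of treated outcomes kept
    if k == 0:
        raise ValueError("no treated outcomes remain after trimming")

    lower = y1[:k].mean() - y0_mean    # keep the k smallest treated outcomes
    upper = y1[-k:].mean() - y0_mean   # keep the k largest treated outcomes
    return lower, upper
```

When observation rates differ sharply between arms, `q` is small and the two trimmed means sit far apart, which is why unadjusted bounds tend to be wide. Adjusting for covariates, as the abstract describes, aims to narrow that gap.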
Taxonomy
Topics: Advanced Causal Inference Techniques · Statistical Methods and Inference · Statistical Methods and Bayesian Inference
