Benchmarking Estimators for Natural Experiments: A Novel Dataset and a Doubly Robust Algorithm
R. Teal Witter, Christopher Musco

TL;DR
This paper introduces a new dataset and benchmark for evaluating estimators of treatment effects in natural experiments, demonstrating the superior performance of doubly robust estimators and proposing a novel estimator with improved variance properties.
Contribution
The paper provides a novel natural experiment dataset, a comprehensive benchmark for estimator evaluation, and a new doubly robust estimator with a closed-form variance expression.
Findings
On the benchmark, doubly robust estimators generally outperform other, more complicated estimators by orders of magnitude.
The benchmark shows that estimator performance varies with sample size, treatment correlation, and propensity score accuracy.
A new estimator with a novel loss function improves variance properties.
Abstract
Estimating the effect of treatments from natural experiments, where treatments are pre-assigned, is an important and well-studied problem. We introduce a novel natural experiment dataset obtained from an early childhood literacy nonprofit. Surprisingly, applying over 20 established estimators to the dataset produces inconsistent results in evaluating the nonprofit's efficacy. To address this, we create a benchmark to evaluate estimator accuracy using synthetic outcomes, whose design was guided by domain experts. The benchmark extensively explores performance as real-world conditions like sample size, treatment correlation, and propensity score accuracy vary. Based on our benchmark, we observe that the class of doubly robust treatment effect estimators, which are based on simple and intuitive regression adjustment, generally outperforms other more complicated estimators by orders of…
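For readers unfamiliar with the term, the following is a minimal sketch of a textbook doubly robust estimator (augmented inverse propensity weighting, AIPW) built on linear regression adjustment. It is not the paper's new estimator, dataset, or benchmark: the function names, the linear outcome models, the propensity clipping range, and the simulated data with a known effect of 2.0 are all assumptions made for illustration.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns a prediction function."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Xn: np.column_stack([np.ones(len(Xn)), Xn]) @ coef

def aipw_ate(X, t, y, propensity):
    """Textbook AIPW estimate of the average treatment effect (illustrative)."""
    # Regression adjustment: one outcome model per treatment arm,
    # each evaluated on every unit.
    mu1 = fit_linear(X[t == 1], y[t == 1])(X)
    mu0 = fit_linear(X[t == 0], y[t == 0])(X)
    # Clip propensities away from 0 and 1 (an assumed stabilisation choice).
    p = np.clip(propensity, 0.01, 0.99)
    # Inverse-propensity-weighted residuals correct the regression estimate.
    correction = t * (y - mu1) / p - (1 - t) * (y - mu0) / (1 - p)
    return float(np.mean(mu1 - mu0 + correction))

def naive_ate(t, y):
    """Difference in raw group means, which ignores confounding."""
    return float(y[t == 1].mean() - y[t == 0].mean())

def simulate(n=5000, effect=2.0, seed=0):
    """Hypothetical synthetic data: treatment depends on the covariates."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 2))
    p = 1 / (1 + np.exp(-(X[:, 0] + 0.5 * X[:, 1])))  # true propensity
    t = (rng.uniform(size=n) < p).astype(float)
    y = effect * t + 3 * X[:, 0] + X[:, 1] + rng.normal(size=n)
    return X, t, y, p

X, t, y, p = simulate()
print("naive difference in means:", naive_ate(t, y))
print("AIPW estimate:", aipw_ate(X, t, y, p))
```

Because treatment assignment in this simulation depends on the same covariates that drive the outcome, the difference in means is biased, while the AIPW estimate should land near the simulated effect. The estimate remains consistent if either the outcome models or the propensity scores are correct, which is the sense in which such estimators are called doubly robust.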
Taxonomy
Topics: Machine Learning and Data Classification · Fault Detection and Control Systems
