Machine Learning for Experimental Design: Methods for Improved Blocking

Brian Quistorff; Gentry Johnson

arXiv:2010.15966·econ.EM·November 2, 2020

TL;DR

This paper explores how machine learning techniques can improve experimental design by largely automating the choice of covariates for blocking and stratification, yielding more accurate treatment effect estimates in small- and medium-sized experiments.

Contribution

The paper outlines ML-based methods for identifying the covariates that matter for blocking. These methods address the incomplete and conflicting guidance in the existing literature and automate most of the process.

Findings

01 Reduced the mean squared error of the treatment effect estimate by 14%-34%

02 Lowered the standard error by 6%-16%

03 Demonstrated these gains in simulations using real-world data

Abstract

Restricting randomization in the design of experiments (e.g., using blocking/stratification, pair-wise matching, or rerandomization) can improve the treatment-control balance on important covariates and therefore improve the estimation of the treatment effect, particularly for small- and medium-sized experiments. Existing guidance on how to identify these variables and implement the restrictions is incomplete and conflicting. We find that the differences arise mainly because what is important in the pre-treatment data may not translate to the post-treatment data. We highlight settings where there is sufficient data to provide clear guidance and outline improved methods to mostly automate the process using modern machine learning (ML) techniques. We show in simulations using real-world data that these methods reduce both the mean squared error of the estimate (14%-34%) and…
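To make the idea of restricted randomization concrete, here is a minimal sketch of pair-wise matching on a single score. It is an illustration under stated assumptions, not the authors' procedure: the function name `assign_matched_pairs` and the input `predicted_outcome` are hypothetical, and the scores are assumed to come from some model fit on pre-treatment data (the abstract does not specify which model).

```python
import random


def assign_matched_pairs(predicted_outcome, seed=0):
    """Return a list of 0/1 treatment flags, one per unit.

    predicted_outcome: hypothetical scores, e.g. outcome predictions
    from any model fit on pre-treatment data (an assumption here).
    """
    rng = random.Random(seed)
    n = len(predicted_outcome)
    # Order units by score so that neighbours in the ordering are similar.
    order = sorted(range(n), key=lambda i: predicted_outcome[i])
    treated = [0] * n
    # Walk through consecutive pairs and treat exactly one unit per pair.
    for start in range(0, n - 1, 2):
        pair = order[start:start + 2]
        treated[rng.choice(pair)] = 1
    # With an odd number of units, the leftover unit gets a coin flip.
    if n % 2 == 1:
        treated[order[-1]] = rng.randint(0, 1)
    return treated


# Toy scores, made up for illustration only.
scores = [2.1, 0.4, 3.3, 0.5, 2.0, 3.1]
assignment = assign_matched_pairs(scores, seed=42)
```

Because each pair contributes exactly one treated and one control unit, the two arms are balanced on the score by construction. Which unit within a pair is treated remains random, so the assignment is still a valid randomization.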

Taxonomy

Topics: Statistical Methods and Inference · Statistical Methods in Clinical Trials · Advanced Causal Inference Techniques