More power to you: Using machine learning to augment human coding for more efficient inference in text-based randomized trials
Reagan Mozer, Luke Miratrix

TL;DR
This paper introduces a machine learning-based framework that increases the power of impact assessments in text-based randomized trials by using documents that human raters did not score, reducing human coding effort while maintaining statistical power.
Contribution
It presents a novel inferential framework combining causal inference, survey sampling, and machine learning to improve impact estimation efficiency in text-based trials.
Findings
Reduces human coding effort while maintaining power.
Effective in simulation and in a real-world education trial.
Provides unbiased treatment impact estimates.
Abstract
For randomized trials that use text as an outcome, traditional approaches for assessing treatment impact require that each document first be manually coded for constructs of interest by trained human raters. This process, the current standard, is both time-consuming and limiting: even the largest human coding efforts are typically constrained to measure only a small set of dimensions across a subsample of available texts. In this work, we present an inferential framework that can be used to increase the power of an impact assessment, given a fixed human-coding budget, by taking advantage of any "untapped" observations -- those documents not manually scored due to time or resource constraints -- as a supplementary resource. Our approach, a methodological combination of causal inference, survey sampling methods, and machine learning, has four steps: (1) select and code a sample of…
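The idea of borrowing strength from unscored documents can be illustrated with a generic model-assisted estimator from survey sampling. The sketch below is not the authors' procedure (the abstract's four steps are truncated above); it assumes simulated data, machine predictions that are already available for every document, and simple random sampling of documents for human coding within each arm. All function names, parameters, and numeric settings are hypothetical.

```python
import numpy as np


def model_assisted_mean(pred_all, y_coded, pred_coded):
    """Mean prediction over all documents, corrected by the mean
    residual (human score minus prediction) on the coded sample."""
    return np.mean(pred_all) + np.mean(y_coded - pred_coded)


def model_assisted_effect(z, pred, coded, y):
    """Difference in model-assisted arm means.
    z: 0/1 treatment indicator; pred: predictions for all documents;
    coded: boolean mask of human-coded documents;
    y: human scores (only read where coded is True)."""
    est = {}
    for arm in (0, 1):
        in_arm = z == arm
        s = in_arm & coded
        est[arm] = model_assisted_mean(pred[in_arm], y[s], pred[s])
    return est[1] - est[0]


def subsample_effect(z, coded, y):
    """Plain difference in means using only human-coded documents."""
    return y[(z == 1) & coded].mean() - y[(z == 0) & coded].mean()


def simulate(rng, n_docs=1000, n_coded_per_arm=50, tau=0.3):
    """Simulate one trial (all settings are illustrative assumptions)."""
    z = rng.permutation(np.repeat([0, 1], n_docs // 2))
    signal = rng.normal(size=n_docs)
    # Score a human rater would assign if the document were coded.
    y = signal + tau * z + rng.normal(scale=0.5, size=n_docs)
    # Deliberately imperfect machine prediction of that score.
    pred = 0.8 * (signal + tau * z) + rng.normal(scale=0.3, size=n_docs)
    # Simple random sample of documents to code within each arm.
    coded = np.zeros(n_docs, dtype=bool)
    for arm in (0, 1):
        idx = np.flatnonzero(z == arm)
        coded[rng.choice(idx, size=n_coded_per_arm, replace=False)] = True
    return z, pred, coded, y


def compare_estimators(seed=0, reps=300):
    """Return arrays of estimates from both estimators across replications."""
    rng = np.random.default_rng(seed)
    ma, sub = [], []
    for _ in range(reps):
        z, pred, coded, y = simulate(rng)
        ma.append(model_assisted_effect(z, pred, coded, y))
        sub.append(subsample_effect(z, coded, y))
    return np.array(ma), np.array(sub)


if __name__ == "__main__":
    ma, sub = compare_estimators()
    print("model-assisted: mean %.3f, sd %.3f" % (ma.mean(), ma.std()))
    print("coded-only:     mean %.3f, sd %.3f" % (sub.mean(), sub.std()))
```

In this setup the residual correction keeps the estimate centered on the simulated effect even though the predictions are imperfect, while the predictions on uncoded documents reduce its spread relative to using the coded subsample alone. In practice the predictive model would need to be fit in a way that keeps predictions independent of the coded sample's residuals, for example by cross-fitting; that step is omitted here.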
Taxonomy
Topics: Advanced Causal Inference Techniques · Meta-analysis and systematic reviews · Health Policy Implementation Science
