Contamination Bias in Linear Regressions
Paul Goldsmith-Pinkham, Peter Hull, and Michal Kolesár

TL;DR
This paper shows that linear regressions with multiple treatments and flexible controls generally yield estimates of each treatment's effect that are contaminated by the effects of the other treatments. The resulting bias is larger in observational studies than in experiments.
Contribution
It shows that regressions with multiple treatments generally fail to estimate convex averages of heterogeneous treatment effects, and it discusses three estimation approaches that avoid the resulting contamination bias.
Findings
Contamination bias affects treatment effect estimates in observational studies.
Experimental studies show less contamination bias due to smaller propensity score variability.
A re-analysis of nine empirical applications finds economically and statistically meaningful contamination bias in the observational studies.
Abstract
We study regressions with multiple treatments and a set of controls that is flexible enough to purge omitted variable bias. We show that these regressions generally fail to estimate convex averages of heterogeneous treatment effects -- instead, estimates of each treatment's effect are contaminated by non-convex averages of the effects of other treatments. We discuss three estimation approaches that avoid such contamination bias, including the targeting of easiest-to-estimate weighted average effects. A re-analysis of nine empirical applications finds economically and statistically meaningful contamination bias in observational studies; contamination bias in experimental studies is more limited due to smaller variability in propensity scores.
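The mechanism described in the abstract can be illustrated with a small simulation. This is a minimal sketch under assumed parameters, not the authors' code or any of their nine applications: the binary control `w`, the assignment probabilities, the effect sizes, and the function names are all hypothetical. Treatment 1 has zero effect for every unit, treatment 2 has an effect only when `w == 1`, and assignment probabilities differ across `w`. The regression is saturated in the control, so there is no omitted variable bias, yet the coefficient on treatment 1 comes out clearly negative (roughly -0.8 in population under these parameters). The stratified difference in means is included only as a contamination-free benchmark chosen for this illustration; it is not claimed to be one of the paper's three approaches.

```python
import numpy as np


def simulate(n=200_000, seed=0):
    """Draw (y, d, w) from a hypothetical three-arm design with one binary control."""
    rng = np.random.default_rng(seed)
    w = rng.integers(0, 2, size=n)  # binary control, P(w = 1) = 0.5

    # Assumed P(arm | w): rows are w = 0, 1; columns are control, treatment 1, treatment 2.
    probs = np.array([[0.6, 0.2, 0.2],
                      [0.1, 0.4, 0.5]])
    cum = np.cumsum(probs[w], axis=1)[:, :2]  # thresholds for arms 1 and 2
    u = rng.random(n)
    d = (u[:, None] >= cum).sum(axis=1)       # arm label in {0, 1, 2}

    # Treatment 1 has no effect for anyone; treatment 2 matters only when w == 1.
    tau2 = np.where(w == 1, 4.0, 0.0)
    y = w + tau2 * (d == 2) + rng.normal(size=n)
    return y, d, w


def ols_coefficients(y, d, w):
    """Regress y on a constant, both treatment dummies, and the control."""
    X = np.column_stack([np.ones(len(y)), d == 1, d == 2, w]).astype(float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta  # [const, treatment 1, treatment 2, w]


def stratified_effect(y, d, w, arm):
    """Within-cell difference in means (arm vs. control), averaged over the distribution of w."""
    est = 0.0
    for v in np.unique(w):
        cell = w == v
        diff = y[cell & (d == arm)].mean() - y[cell & (d == 0)].mean()
        est += cell.mean() * diff
    return est


if __name__ == "__main__":
    y, d, w = simulate()
    beta = ols_coefficients(y, d, w)
    print("OLS coefficient on treatment 1:", round(beta[1], 3))
    print("Stratified estimate, treatment 1:", round(stratified_effect(y, d, w, 1), 3))
    print("Stratified estimate, treatment 2:", round(stratified_effect(y, d, w, 2), 3))
```

In this design the stratified estimate for treatment 1 stays near its true value of zero, while the regression coefficient picks up a non-convex average of treatment 2's effects. Making the rows of `probs` identical (as in a simple randomized experiment) removes the variation in propensity scores and shrinks the regression coefficient on treatment 1 toward zero as well.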
Taxonomy
Topics: Advanced Causal Inference Techniques · Statistical Methods and Inference · Statistical Methods and Bayesian Inference
