Variance reduction combining pre-experiment and in-experiment data
Zhexiao Lin, Pablo Crespo

TL;DR
This paper introduces a robust, scalable framework that combines pre-experiment and in-experiment data to reduce the variance of A/B test estimates, improving experiment sensitivity at a fixed sample size.
Contribution
It proposes a new, general method for variance reduction that effectively integrates both pre- and in-experiment data, with theoretical guarantees and practical validation.
Findings
Significant variance reduction in multiple Etsy experiments
Framework outperforms existing methods like CUPED and CUPAC
Effective even with few post-treatment covariates
Abstract
Online controlled experiments (A/B testing) are fundamental to data-driven decision-making in many companies. Improving the sensitivity of these experiments under fixed sample size constraints requires reducing the variance of the average treatment effect (ATE) estimator. Existing variance reduction techniques such as CUPED and CUPAC use pre-experiment data, but their effectiveness depends on how predictive those data are for outcomes measured during the experiment. In-experiment data are often more strongly correlated with the outcome, but using arbitrary post-treatment variables can introduce bias. In this paper, we propose a general, robust, and scalable framework that combines both pre-experiment and in-experiment data to achieve variance reduction. Our framework is simple, interpretable, and computationally efficient, making it practical for real-world deployment. We develop the…
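To make the baseline the abstract refers to concrete, here is a minimal sketch of the standard CUPED adjustment, which uses a single pre-experiment covariate. It is not the framework proposed in the paper, which additionally uses in-experiment data. The function names `cuped_ate` and `simulate`, the simulated data-generating process, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def cuped_ate(y, x, treated):
    """Difference-in-means ATE estimate on a CUPED-adjusted outcome.

    y: outcome measured during the experiment
    x: pre-experiment covariate (unaffected by treatment)
    treated: boolean array, True for treatment-group units
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    # Pooled regression coefficient of y on x: Cov(y, x) / Var(x).
    theta = np.cov(y, x, ddof=1)[0, 1] / np.var(x, ddof=1)
    # Subtract the part of y explained by the centered covariate.
    y_adj = y - theta * (x - x.mean())
    return y_adj[treated].mean() - y_adj[~treated].mean()


def simulate(n_runs=500, n=2000, effect=0.5, seed=0):
    """Repeat a synthetic A/B test and collect raw and CUPED estimates."""
    rng = np.random.default_rng(seed)
    raw, adj = [], []
    for _ in range(n_runs):
        x = rng.normal(size=n)                 # pre-experiment covariate
        treated = rng.random(n) < 0.5          # random assignment
        # Assumed outcome model: strongly driven by x, plus noise and effect.
        y = 2.0 * x + rng.normal(size=n) + effect * treated
        raw.append(y[treated].mean() - y[~treated].mean())
        adj.append(cuped_ate(y, x, treated))
    return np.array(raw), np.array(adj)


raw, adj = simulate()
print("variance of raw estimator:  ", raw.var())
print("variance of CUPED estimator:", adj.var())
```

In this synthetic setup the covariate explains most of the outcome variance, so the adjusted estimator should vary far less across runs than the raw difference in means. When the pre-experiment covariate is only weakly predictive, the gain shrinks, which is the limitation the abstract cites as motivation for bringing in in-experiment data.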
Taxonomy
Topics: Industrial Vision Systems and Defect Detection · Optimal Experimental Design Methods · Fault Detection and Control Systems
