Power Analysis for Experiments with Clustered Data, Ratio Metrics, and Regression for Covariate Adjustment
Tim Hesterberg, Ben Knight (Instacart)

TL;DR
This paper presents a unified, stable framework for calculating standard errors in A/B tests involving clustered data, ratio metrics, and covariate adjustment, enabling more accurate power analysis and hypothesis testing.
Contribution
It introduces a common, software-compatible framework for standard error calculation that simplifies analysis of complex experimental data involving covariates and clustering.
Findings
Covariate adjustment gives an estimated median 66% variance reduction for a key metric.
That variance reduction cuts the estimated experiment run time by 66%.
Framework is compatible with standard tools and numerically stable.
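The link between the two 66% figures can be sketched as follows. Under a standard two-sample z-test approximation, the required sample size is proportional to the metric variance, so if traffic arrives at a constant rate (an assumption made here, not stated in the summary), a given fractional variance reduction shortens run time by the same fraction. The function name and the numbers below are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def required_n_per_arm(variance, min_detectable_effect, alpha=0.05, power=0.8):
    """Approximate per-arm sample size for a two-sided, two-sample z-test.

    Assumes equal arm sizes and equal variance in both arms.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return 2 * variance * (z_alpha + z_power) ** 2 / min_detectable_effect ** 2


# Illustrative values: unit variance before adjustment, 66% less after.
baseline_n = required_n_per_arm(variance=1.0, min_detectable_effect=0.05)
adjusted_n = required_n_per_arm(variance=0.34, min_detectable_effect=0.05)

# Fractional reduction in required sample size (and in run time, if
# traffic is constant over time).
sample_size_reduction = 1 - adjusted_n / baseline_n
```

Because the variance enters the formula linearly, the reduction does not depend on the chosen significance level, power, or minimum detectable effect.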
Abstract
We describe how to calculate standard errors for A/B tests that include clustered data, ratio metrics, and/or covariate adjustment. We may do this for power analysis/sample size calculations prior to running an experiment using historical data, or after an experiment for hypothesis testing and confidence intervals. The different applications have a common framework, using the sample variance of certain residuals. The framework is compatible with modular software, can be plugged into standard tools, doesn't require computing covariance matrices, and is numerically stable. Using this approach we estimate that covariate adjustment gives a median 66% variance reduction for a key metric, reducing experiment run time by 66%.
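As a rough illustration of "the sample variance of certain residuals", the sketch below uses the standard delta-method linearization for a ratio metric with one row per cluster, followed by an ordinary least squares adjustment on a pre-experiment covariate. The paper's exact residual definitions are not reproduced in this summary, so this is one common construction rather than the authors' implementation. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def ratio_metric_residuals(y, x):
    """Return the ratio sum(y)/sum(x) and its linearized per-cluster residuals.

    y and x hold per-cluster totals of the numerator and denominator.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    ratio = y.sum() / x.sum()
    residuals = (y - ratio * x) / x.mean()
    return ratio, residuals


def standard_error(residuals):
    """Standard error from the sample variance of residuals (one per cluster)."""
    residuals = np.asarray(residuals, dtype=float)
    return residuals.std(ddof=1) / np.sqrt(len(residuals))


def covariate_adjusted_residuals(residuals, covariate):
    """Residuals left after an OLS fit (with intercept) on a covariate."""
    residuals = np.asarray(residuals, dtype=float)
    covariate = np.asarray(covariate, dtype=float)
    design = np.column_stack([np.ones(len(covariate)), covariate])
    coef, *_ = np.linalg.lstsq(design, residuals, rcond=None)
    return residuals - design @ coef


# Synthetic historical data: one row per cluster (for example, per user).
rng = np.random.default_rng(0)
n_clusters = 500
pre_period = rng.gamma(2.0, 10.0, n_clusters)            # pre-experiment covariate
x = rng.poisson(5, n_clusters) + 1.0                     # denominator per cluster
y = 3.0 * x + 0.5 * pre_period + rng.normal(0, 5, n_clusters)  # numerator

ratio, res = ratio_metric_residuals(y, x)
adj = covariate_adjusted_residuals(res, pre_period)

se_unadjusted = standard_error(res)
se_adjusted = standard_error(adj)
variance_reduction = 1 - adj.var(ddof=1) / res.var(ddof=1)
```

Both standard errors come from a plain sample variance over clusters, with no covariance matrix formed. That matches the abstract's point that the same residual-based calculation serves power analysis on historical data and inference after an experiment.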
Taxonomy
Topics: Advanced Statistical Methods and Models · Multi-Criteria Decision Making
