Setting the duration of online A/B experiments
Harrison H. Li, Chaoyu Yu

TL;DR
This paper derives an analytical formula to determine the optimal duration of online A/B experiments by modeling how confidence interval width depends on sample size, duration, and user-specific temporal correlation, with validation on YouTube data.
Contribution
It introduces a formula that links CI width to sample size, experiment duration, and user-specific temporal correlation, enabling better experiment planning and resource allocation.
Findings
The derived formula accurately predicts CI width in real experiments on YouTube data.
Higher user-specific temporal correlation slows CI width decay over time.
Pre-period data influences the rate at which CI width decreases.
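The second finding can be illustrated with a minimal sketch. It is not the paper's formula, which is based on a ratio estimator and is not reproduced in this summary. It assumes a simple random-intercept model for a difference in per-user means: each user's daily metric is a persistent user-level effect with variance `var_user` plus independent daily noise with variance `var_noise`. The function names, parameter names, and the 1.96 normal quantile are illustrative assumptions.

```python
import math


def temporal_correlation(var_user, var_noise):
    """Correlation between two different days of the same user (assumed model)."""
    return var_user / (var_user + var_noise)


def ci_half_width(n_per_arm, days, var_user, var_noise, z=1.96):
    """Approximate CI half-width for a difference in per-user means.

    Assumes two equal-sized arms and users who stay in the same arm for
    all `days`. Averaging over days shrinks only the daily noise; the
    persistent user-level variance is reduced only by adding users.
    """
    var_of_user_mean = var_user + var_noise / days
    standard_error = math.sqrt(2.0 * var_of_user_mean / n_per_arm)
    return z * standard_error


if __name__ == "__main__":
    n = 10_000
    for rho in (0.1, 0.5, 0.9):
        # Total per-day variance fixed at 1, split according to rho.
        var_user, var_noise = rho, 1.0 - rho
        base = ci_half_width(n, 1, var_user, var_noise)
        ratios = [
            ci_half_width(n, t, var_user, var_noise) / base
            for t in (7, 14, 28)
        ]
        print(rho, [round(r, 3) for r in ratios])
```

Under these assumptions the half-width never falls below `z * sqrt(2 * var_user / n_per_arm)` however long the experiment runs, so when the temporal correlation is high, extending the duration buys little and only more users narrow the interval further.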
Abstract
In designing an online A/B experiment, it is crucial to select a sample size and duration that ensure the resulting confidence interval (CI) for the treatment effect is the right width to detect an effect of meaningful magnitude with sufficient statistical power without wasting resources. While the relationship between sample size and CI width is well understood, the effect of experiment duration on CI width remains less clear. This paper provides an analytical formula for the width of a CI based on a ratio treatment effect estimator as a function of both sample size (N) and duration (T). The formula is derived from a mixed effects model with two variance components. One component, referred to as the temporal variance, persists over time for experiments where the same users are kept in the same experiment arm across different days. The remaining error variance component, by contrast,…
Taxonomy
Topics: Health, Environment, Cognitive Aging · Cell Image Analysis Techniques
