Always Valid Inference: Bringing Sequential Analysis to A/B Testing
Ramesh Johari, Leo Pekelis, David J. Walsh

TL;DR
This paper introduces 'always valid' statistical methods for A/B testing that enable continuous monitoring and decision-making without inflating error rates, improving flexibility and reliability in sequential analysis.
Contribution
It develops a framework for valid sequential inference in A/B testing, allowing real-time data analysis and multiple hypothesis testing control without prior knowledge of the user's relative preference between sample size and power.
Findings
Implemented in a large-scale commercial platform
Enables valid continuous monitoring of experiments
Balances sample size and power efficiently
Abstract
A/B tests are typically analyzed via frequentist p-values and confidence intervals, but these inferences are wholly unreliable if users endogenously choose sample sizes by *continuously monitoring* their tests. We define *always valid* p-values and confidence intervals that let users try to take advantage of data as fast as it becomes available, providing valid statistical inference whenever they make their decision. Always valid inference can be interpreted as a natural interface for a sequential hypothesis test, which empowers users to implement a modified test tailored to them. In particular, we show in an appropriate sense that the measures we develop trade off sample size and power efficiently, despite a lack of prior knowledge of the user's relative preference between these two goals. We also use always valid p-values to obtain multiple hypothesis testing control in the sequential…
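To make the idea of an always valid p-value concrete, here is a minimal sketch based on a mixture sequential probability ratio test for normally distributed observations with known variance. The function name, the parameter names, the normal mixing distribution centered at `theta0`, and its variance `tau2` are assumptions chosen for this illustration; the abstract above does not specify them. The p-value at step n is the running minimum of the inverse mixture likelihood ratio, so it can only decrease as data arrives.

```python
import math


def always_valid_p_values(observations, theta0=0.0, sigma2=1.0, tau2=1.0):
    """Return one always valid p-value per observation (illustrative sketch).

    Assumptions (not taken from the abstract): observations are i.i.d. normal
    with known variance sigma2, the null hypothesis is mean == theta0, and the
    alternative is averaged over a N(theta0, tau2) mixing distribution.
    """
    p_values = []
    running_sum = 0.0
    p = 1.0  # before any data arrives, the p-value is 1
    for n, y in enumerate(observations, start=1):
        running_sum += y
        mean = running_sum / n
        denom = sigma2 + n * tau2
        # Log of the mixture likelihood ratio after n observations.
        log_lr = 0.5 * math.log(sigma2 / denom) + (
            n ** 2 * tau2 * (mean - theta0) ** 2
        ) / (2.0 * sigma2 * denom)
        # Running minimum of 1 / likelihood ratio, capped at 1.
        p = min(p, math.exp(-log_lr))
        p_values.append(p)
    return p_values
```

Under these assumptions, a user could check the latest p-value after every observation and stop as soon as it falls below a chosen significance level, or keep collecting data, without the error inflation that repeated looks at a fixed-horizon p-value would cause.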
Taxonomy
Topics: Statistical Methods in Clinical Trials · Advanced Bandit Algorithms Research · Gaussian Processes and Bayesian Inference
