Design-Based Confidence Sequences: A General Approach to Risk Mitigation in Online Experimentation
Dae Woong Ham, Iavor Bojinov, Michael Lindon, Martin Tingley

TL;DR
This paper introduces a general method for constructing valid confidence sequences in online experiments, enabling risk mitigation and early stopping without invalidating statistical guarantees.
Contribution
It provides a novel, assumption-light framework for confidence sequences applicable to various experimental designs, including multi-arm bandits and time series, with variance reduction techniques.
Findings
Confidence sequences control type-1 error over time
Early stopping can be achieved after few observations
Real-world Netflix experiments demonstrate practical effectiveness
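To make the first finding concrete, the sketch below checks time-uniform coverage by simulation. It does not implement the paper's design-based construction. As a stand-in it uses the classical Gaussian-mixture confidence sequence for the mean of i.i.d. unit-variance normal observations. The function names, the mixing parameter `rho`, the horizon, and the number of simulated runs are all illustrative assumptions.

```python
import math
import random


def mixture_radius(t, alpha=0.05, rho=1.0):
    """Half-width at time t of a Gaussian-mixture confidence sequence for the
    mean of unit-variance observations (stand-in, not the paper's method)."""
    return math.sqrt((t + rho) * math.log((t + rho) / (rho * alpha ** 2))) / t


def ever_excludes_truth(n_obs, rng, alpha=0.05, rho=1.0):
    """Return True if the interval excludes the true mean (0) at any time."""
    running_sum = 0.0
    for t in range(1, n_obs + 1):
        running_sum += rng.gauss(0.0, 1.0)  # true mean is 0 by construction
        if abs(running_sum / t) > mixture_radius(t, alpha, rho):
            return True
    return False


rng = random.Random(0)  # fixed seed so the run is reproducible
n_sims = 500
miscoverage = sum(ever_excludes_truth(1000, rng) for _ in range(n_sims)) / n_sims
print(f"share of runs that ever excluded the true mean: {miscoverage:.3f}")
```

Under these assumptions, the share of runs in which the interval ever excludes the true mean should stay at or below roughly `alpha`, even though the interval is checked after every observation.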
Abstract
Randomized experiments have become the standard method for companies to evaluate the performance of new products or services. In addition to augmenting managers' decision-making, experimentation mitigates risk by limiting the proportion of customers exposed to innovation. Since many experiments are on customers arriving sequentially, a potential solution is to allow managers to "peek" at the results when new data becomes available and stop the test if the results are statistically significant. Unfortunately, peeking invalidates the statistical guarantees for standard statistical analysis and leads to uncontrolled type-1 error. Our paper provides valid design-based confidence sequences (sequences of confidence intervals with uniform type-1 error guarantees over time) for various sequential experiments in an assumption-light manner. In particular, we focus on finite-sample estimands…
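The abstract's claim that peeking inflates type-1 error can be illustrated with a small A/A simulation. This is a minimal sketch under simplifying assumptions that are not taken from the paper: observations are i.i.d. standard normal with known variance, the analyst runs a two-sided z-test at the 5% level, and the function name, sample sizes, and peeking schedule are hypothetical.

```python
import math
import random


def false_positive_rates(n_sims=1000, n_obs=500, peek_every=25,
                         z_crit=1.96, seed=0):
    """Compare false-positive rates when there is no true effect:
    rejecting at any interim look versus testing once at the end."""
    rng = random.Random(seed)
    peeking_rejections = 0
    fixed_rejections = 0
    for _ in range(n_sims):
        running_sum = 0.0
        rejected_at_any_peek = False
        for t in range(1, n_obs + 1):
            running_sum += rng.gauss(0.0, 1.0)  # null is true: mean 0
            if t % peek_every == 0:
                z = running_sum / math.sqrt(t)  # z-statistic, known variance 1
                if abs(z) > z_crit:
                    rejected_at_any_peek = True
        peeking_rejections += rejected_at_any_peek
        # Fixed-horizon analysis: a single test after all n_obs observations.
        fixed_rejections += abs(running_sum / math.sqrt(n_obs)) > z_crit
    return peeking_rejections / n_sims, fixed_rejections / n_sims


peeking_rate, fixed_rate = false_positive_rates()
print(f"false-positive rate with peeking:   {peeking_rate:.3f}")
print(f"false-positive rate, single test:   {fixed_rate:.3f}")
```

With 20 interim looks, the peeking rate in this setup should land well above the nominal 5%, while the single end-of-experiment test stays close to it.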
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Data Stream Mining Techniques · Statistical Methods in Clinical Trials
