Large-Scale Online Experimentation with Quantile Metrics
Min Liu, Xiaohui Sun, Maneesh Varshney, Ya Xu

TL;DR
This paper introduces a fast, scalable, and statistically valid method for conducting A/B tests with quantile metrics, addressing a key challenge in large-scale online experimentation.
Contribution
It presents a novel variance estimation technique for quantile metrics that is both accurate and computationally efficient, suitable for large datasets.
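This summary does not spell out the estimator itself, so the following is only a sketch of one standard way to build a cluster-aware quantile variance, and it is an assumption rather than the authors' procedure. It applies the delta method to the per-user share of observations at or below the sample quantile, then divides by a squared density estimate. The function name `clustered_quantile_variance`, the finite-difference bandwidth `h`, and the simulated latency data are all hypothetical.

```python
import numpy as np

def clustered_quantile_variance(values, user_ids, p, h=0.01):
    """Delta-method variance of the sample p-quantile, clustering by user.

    Illustrative sketch only; not taken from the paper.
    """
    values = np.asarray(values, dtype=float)
    _, user_idx = np.unique(user_ids, return_inverse=True)
    n_users = user_idx.max() + 1
    q = np.quantile(values, p)

    # Per-user sufficient statistics: total views, and views at or below q.
    n_i = np.bincount(user_idx, minlength=n_users)
    s_i = np.bincount(user_idx, weights=(values <= q).astype(float),
                      minlength=n_users)

    # Variance of the ratio sum(s_i) / sum(n_i) via linearised residuals.
    ratio = s_i.sum() / n_i.sum()
    resid = s_i - ratio * n_i
    var_ratio = n_users / (n_users - 1) * np.sum(resid**2) / n_i.sum() ** 2

    # Finite-difference estimate of the density at the quantile.
    lo, hi = np.quantile(values, [p - h, p + h])
    density = 2 * h / (hi - lo)
    return var_ratio / density**2

# Hypothetical data: page loads that share a user-level effect.
rng = np.random.default_rng(1)
n_users = 2000
views = 1 + rng.poisson(9, size=n_users)
user_ids = np.repeat(np.arange(n_users), views)
latency = np.exp(rng.normal(0, 1, n_users)[user_ids]
                 + rng.normal(0, 0.5, user_ids.size))

clustered = clustered_quantile_variance(latency, user_ids, p=0.9)
# Treating every view as its own "user" recovers the i.i.d. formula.
iid = clustered_quantile_variance(latency, np.arange(latency.size), p=0.9)
print(f"clustered / i.i.d. variance ratio: {clustered / iid:.1f}")
```

Because the estimator needs only two sums per user plus a few quantiles, it avoids repeated resampling, which is the kind of property that makes an approach practical on large datasets.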
Findings
Achieves over 500x speedup compared to bootstrap methods.
Differs from bootstrap estimates by only 2%.
Provides a scalable solution for quantile-based A/B testing.
Abstract
Online experimentation (or A/B testing) has been widely adopted in industry as the gold standard for measuring product impacts. Despite this wide adoption, little of the literature discusses A/B testing with quantile metrics. Quantile metrics, such as 90th percentile page load time, are crucial to A/B testing because many key performance metrics, including site speed and service latency, are defined as quantiles. However, with LinkedIn's data size, quantile metric A/B testing is extremely challenging because there is no statistically valid and scalable variance estimator for the quantile of dependent samples: the bootstrap estimator is statistically valid but takes days to compute, while the standard asymptotic variance estimate is scalable but results in order-of-magnitude underestimation. In this paper, we present a statistically valid and scalable methodology for A/B testing with quantiles that is fully…
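The gap the abstract describes between the two existing estimators can be reproduced on a small scale. The sketch below is an illustration on simulated data, not the paper's experiment: it compares the textbook i.i.d. asymptotic variance of a sample quantile, p(1-p) / (n f(q)^2), with a bootstrap that resamples whole users. The data-generating process, the helper names, and the bandwidth `h` are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 2,000 users, each with several correlated page loads.
n_users = 2000
views_per_user = 1 + rng.poisson(9, size=n_users)
user_effect = rng.normal(0.0, 1.0, size=n_users)  # shared within a user
per_user = [
    np.exp(user_effect[u] + rng.normal(0.0, 0.5, size=views_per_user[u]))
    for u in range(n_users)
]
values = np.concatenate(per_user)
p = 0.90

def naive_quantile_variance(x, p, h=0.01):
    """Asymptotic variance p(1-p) / (n f(q)^2); assumes i.i.d. samples."""
    lo, hi = np.quantile(x, [p - h, p + h])
    density = 2 * h / (hi - lo)  # finite-difference estimate of f(q)
    return p * (1 - p) / (x.size * density**2)

def cluster_bootstrap_variance(groups, p, n_boot=200, rng=rng):
    """Resample whole users with replacement to keep within-user dependence."""
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        picked = rng.integers(0, len(groups), size=len(groups))
        resampled = np.concatenate([groups[i] for i in picked])
        estimates[b] = np.quantile(resampled, p)
    return estimates.var(ddof=1)

naive_var = naive_quantile_variance(values, p)
boot_var = cluster_bootstrap_variance(per_user, p)
print(f"bootstrap / naive variance ratio: {boot_var / naive_var:.1f}")
```

On data like this the naive figure should come out several times smaller than the bootstrap one, because page loads from the same user are far from independent. The exact ratio depends on how many views each user contributes and how strongly they are correlated. The bootstrap loop also shows where the cost comes from: every replicate rebuilds and re-sorts the full dataset.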
Taxonomy
Topics: Advanced Bandit Algorithms Research · Online Learning and Analytics · Advanced Multi-Objective Optimization Algorithms
