Automatic Detection and Diagnosis of Biased Online Experiments
Nanyu Chen, Min Liu, Ya Xu

TL;DR
This paper presents an automated system for detecting and diagnosing common biases in online A/B experiments, making results more reliable and improving decision-making on large-scale experimentation platforms.
Contribution
It introduces scalable algorithms for automatic detection of four common biases in online experiments, advancing the development of intelligent A/B testing platforms.
Findings
Successfully identified design-imposed bias, self-selection bias, novelty effect, and trigger-day effect.
Automated bias detection improves experiment validity and decision accuracy.
Framework scalable to large online experimentation platforms.
Abstract
We have seen massive growth of online experiments at LinkedIn, and in industry at large. It is now more important than ever to create an intelligent A/B platform that can truly democratize A/B testing by allowing everyone to make quality decisions, regardless of their skillset. With the tremendous knowledge base created around experimentation, we are able to mine historical data and discover the most common causes of biased experiments. In this paper, we share four such common causes, and how we build automatic detection and diagnosis of these root causes into our A/B testing platform. The root causes are design-imposed bias, self-selection bias, novelty effect, and trigger-day effect. We will discuss in detail what each bias is and the scalable algorithm we developed to detect it. Surfacing up the existence and root cause of bias automatically for every…
Taxonomy
Topics: Machine Learning and Data Classification · Statistical Methods in Clinical Trials · Online Learning and Analytics
