PlanAlyzer: Assessing Threats to the Validity of Online Experiments

Emma Tosch; Eytan Bakshy; Emery D. Berger; David D. Jensen; J. Eliot B. Moss

arXiv:1909.13649·cs.PL·October 1, 2019

TL;DR

PlanAlyzer is a novel static analysis tool that automatically checks the internal validity of online experiments specified in the PlanOut framework, helping ensure trustworthy experimental results at scale.

Contribution

It introduces the first static checking approach for internal validity in online experiments, specifically targeting the PlanOut framework and automating threat detection and contrast generation.

Findings

01. Achieves 92% precision and 92% recall on a mutated corpus of PlanOut scripts.

02. Automatically generates valid contrasts that match manually specified ones.

03. Identifies threats to internal validity in real-world PlanOut experiments deployed at Facebook.

Abstract

Online experiments are ubiquitous. As the scale of experiments has grown, so has the complexity of their design and implementation. In response, firms have developed software frameworks for designing and deploying online experiments. Ensuring that experiments in these frameworks are correctly designed and that their results are trustworthy (referred to as *internal validity*) can be difficult. Currently, verifying internal validity requires manual inspection by someone with substantial expertise in experimental design. We present the first approach for statically checking the internal validity of online experiments. Our checks are based on well-known problems that arise in experimental design and causal inference. Our analyses target PlanOut, a widely deployed, open-source experimentation framework that uses a domain-specific language to specify and run complex experiments. We have…
