Randomized Experimental Design via Geographic Clustering
David Rolnick, Kevin Aydin, Jean Pouget-Abadie, Shahab Kamali, Vahab Mirrokni, Amir Najmi

TL;DR
This paper introduces GeoCUTS, a novel algorithm that forms geographical clusters to minimize interference in web experiments, and validates it empirically on Google Search data.
Contribution
The paper presents GeoCUTS, a new clustering algorithm that reduces interference in geographic experiments, along with a statistical framework for evaluating clustering effectiveness.
Findings
GeoCUTS performs comparably to hand-crafted regions.
The clustering minimizes user movement interference.
The framework effectively measures clustering quality.
Abstract
Web-based services often run randomized experiments to improve their products. A popular way to run these experiments is to use geographical regions as units of experimentation, since this does not require tracking of individual users or browser cookies. Since users may issue queries from multiple geographical locations, geo-regions cannot be considered independent and interference may be present in the experiment. In this paper, we study this problem, and first present GeoCUTS, a novel algorithm that forms geographical clusters to minimize interference while preserving balance in cluster size. We use a random sample of anonymized traffic from Google Search to form a graph representing user movements, then construct a geographically coherent clustering of the graph. Our main technical contribution is a statistical framework to measure the effectiveness of clusterings. Furthermore, we…
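The graph construction described in the abstract can be illustrated with a small sketch. This is not the GeoCUTS algorithm or the paper's statistical framework; it only shows, under assumed inputs, how a user-movement graph might be built from (user, region) query records and how the share of movement that crosses cluster boundaries could be computed for a given clustering. The function names `build_movement_graph` and `cut_fraction`, the input format, and the cut-fraction score are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_movement_graph(queries):
    """Build an undirected weighted graph from (user_id, region) pairs.

    Assumption: an edge (a, b) has weight equal to the number of users
    who issued queries from both region a and region b.
    """
    regions_by_user = defaultdict(set)
    for user, region in queries:
        regions_by_user[user].add(region)

    weights = defaultdict(int)
    for regions in regions_by_user.values():
        # Sort so each undirected edge has one canonical key (a, b), a < b.
        for a, b in combinations(sorted(regions), 2):
            weights[(a, b)] += 1
    return dict(weights)


def cut_fraction(weights, cluster_of):
    """Fraction of total edge weight that crosses cluster boundaries.

    A lower value means less movement between clusters, which is used
    here as a simple (hypothetical) proxy for interference.
    """
    total = sum(weights.values())
    if total == 0:
        return 0.0
    cut = sum(w for (a, b), w in weights.items()
              if cluster_of[a] != cluster_of[b])
    return cut / total


# Toy data: four users, four regions.
queries = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "B"), ("u2", "C"),
    ("u3", "A"), ("u3", "B"),
    ("u4", "C"), ("u4", "D"),
]
graph = build_movement_graph(queries)

# Two balanced clusterings of the same regions.
coherent = {"A": 0, "B": 0, "C": 1, "D": 1}
scattered = {"A": 0, "B": 1, "C": 0, "D": 1}
print(cut_fraction(graph, coherent))   # 0.25
print(cut_fraction(graph, scattered))  # 1.0
```

Both clusterings in the toy example are balanced (two regions each), but the first keeps frequently co-visited regions together and so cuts far less of the movement weight. A real pipeline would also have to enforce balance and geographic coherence, which this sketch does not attempt.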
