False Discovery Rate Controlled Heterogeneous Treatment Effect Detection for Online Controlled Experiments
Yuxiang Xie, Nanyu Chen, Xiaolin Shi

TL;DR
This paper introduces statistical methods that detect heterogeneous treatment effects across user cohorts in online experiments while controlling the false discovery rate, and that identify which user factors contribute to the heterogeneity.
Contribution
It presents novel methods for identifying heterogeneous treatment effects (HTE) in A/B testing with a controlled false discovery rate (FDR), so that the impact of a change can be understood per user cohort rather than only on average.
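As a rough illustration of what FDR-controlled detection across cohorts can look like, the sketch below tests each cohort's treatment effect separately and then applies the Benjamini-Hochberg step-up procedure to the resulting p-values. This summary does not say which FDR procedure the authors use, so Benjamini-Hochberg is an assumption here, as are the Welch-style z-test and the function names `two_sided_p_value` and `benjamini_hochberg`. It is a minimal sketch, not the paper's method.

```python
import math
from statistics import mean, variance


def two_sided_p_value(treatment, control):
    """Two-sided p-value for a difference in means (normal approximation).

    Assumes each sample has at least two values and nonzero variance.
    """
    se = math.sqrt(variance(treatment) / len(treatment)
                   + variance(control) / len(control))
    z = (mean(treatment) - mean(control)) / se
    # P(|Z| > |z|) for a standard normal Z
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True if rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical per-cohort p-values (illustrative numbers, not from the paper).
cohort_p_values = {"country=A": 0.04, "country=B": 0.001,
                   "device=X": 0.03, "device=Y": 0.5}
flags = benjamini_hochberg(list(cohort_p_values.values()), alpha=0.05)
flagged_cohorts = [name for name, hit in zip(cohort_p_values, flags) if hit]
```

With these made-up inputs, only the cohort with the smallest p-value survives the correction, even though three of the four raw p-values fall below 0.05. That gap is the reason a multiple-testing correction matters when many cohorts are examined in a single experiment.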
Findings
Methods work robustly on both simulated and real data
Toolkit deployed at Snap for large-scale experiments
Provides insights into user group heterogeneity
Abstract
Online controlled experiments (a.k.a. A/B testing) have been used as the mantra for data-driven decision making on feature changing and product shipping in many Internet companies. However, it is still a great challenge to systematically measure how every code or feature change impacts millions of users with great heterogeneity (e.g. countries, ages, devices). The most commonly used A/B testing framework in many companies is based on Average Treatment Effect (ATE), which cannot detect the heterogeneity of treatment effect on users with different characteristics. In this paper, we propose statistical methods that can systematically and accurately identify Heterogeneous Treatment Effect (HTE) of any user cohort of interest (e.g. mobile device type, country), and determine which factors (e.g. age, gender) of users contribute to the heterogeneity of the treatment effect in an A/B test. By…
