Treatment Effect Detection with Controlled FDR under Dependence for Large-Scale Experiments
Yihan Bao, Shichao Han, Yong Wang

TL;DR
This paper introduces new statistical methods for large-scale A/B testing that improve the detection power of treatment effects while reliably controlling the false discovery rate under dependence, with validation on simulations and real data.
Contribution
It proposes robust, scalable methods for identifying average treatment effects with controlled FDR, addressing the limitations of the traditional BH procedure under dependence.
Findings
The proposed methods achieve higher power than BH under dependence.
They are validated on both simulated and real-world data.
They compare favorably with recent FDR-control techniques.
Abstract
Online controlled experiments (also known as A/B testing) have been viewed as a gold standard at large data-driven companies for the last few decades. The most common A/B testing framework adopted by many companies uses the "average treatment effect" (ATE) as its test statistic. However, it remains a difficult problem for companies to improve the power of detecting the ATE while controlling the "false discovery rate" (FDR) at a predetermined level. One of the most popular FDR-control algorithms is the BH method, but it is only known to control FDR under restrictive positive-dependence assumptions, with a conservative bound. In this paper, we propose statistical methods that can systematically and accurately identify ATEs, and demonstrate that they work robustly, with a controlled low FDR and higher power, using both simulation and real-world experimentation data. Moreover, we discuss the scalability…
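For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of the standard Benjamini-Hochberg (BH) step-up procedure. It is not the method proposed in the paper. The function name `benjamini_hochberg` and its parameters are illustrative, and the input is assumed to be one p-value per experiment (for example, from a test of whether each ATE is zero).

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the BH step-up rule rejects.

    Baseline sketch only (not the paper's proposed method). BH is known to
    control FDR at level alpha under independence or positive dependence.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= alpha * k / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k = rank

    # Step-up: reject every hypothesis ranked at or below k, even if some
    # of them individually missed their own threshold.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values from four experiments.
    print(benjamini_hochberg([0.02, 0.045, 0.03, 0.04], alpha=0.05))
```

In this example only the largest p-value meets its threshold (0.045 ≤ 0.05), yet the step-up rule rejects all four hypotheses. The rule compares ranked p-values against a fixed threshold sequence and uses no information about how the test statistics depend on one another.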
Taxonomy
Topics: Statistical Methods in Clinical Trials · Advanced Causal Inference Techniques · Optimal Experimental Design Methods
