Finding Distributions that Differ, with False Discovery Rate Control
Yonghoon Lee, Edgar Dobriban, and Eric Tchetgen Tchetgen

TL;DR
This paper introduces a distribution-free multiple testing method with false discovery rate control for comparing a reference distribution against multiple others, demonstrated through simulations and real datasets.
Contribution
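It introduces batch conformal p-values and shows that they satisfy positive regression dependence across the comparison groups, so the Benjamini-Hochberg procedure gives exact, distribution-free FDR control. The method is applied to real-world data.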
Findings
The method achieves FDR control in simulations
Its performance is comparable to that of distribution-specific methods
It identifies significant sub-populations in real datasets
Abstract
We consider the problem of comparing a reference distribution with several other distributions. Given a sample from both the reference and the comparison groups, we aim to identify the comparison groups whose distributions differ from that of the reference group. Viewing this as a multiple testing problem, we introduce a methodology that provides exact, distribution-free control of the false discovery rate. To do so, we introduce the concept of batch conformal p-values and demonstrate that they satisfy positive regression dependence across the groups [Benjamini and Yekutieli, 2001], thereby enabling control of the false discovery rate through the Benjamini-Hochberg procedure. The proof of positive regression dependence introduces a novel technique for the inductive construction of rank vectors with almost sure dominance under exchangeability. We evaluate the performance of the proposed…
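The final step named in the abstract, the Benjamini-Hochberg procedure, can be sketched as follows. This is a minimal illustration of the standard step-up rule, not the authors' code: the function name `benjamini_hochberg`, the default level `alpha=0.1`, and the example p-values are assumptions made here, and the construction of the batch conformal p-values themselves is not reproduced because the excerpt does not define it.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Return a list of booleans: True where the hypothesis is rejected.

    Standard Benjamini-Hochberg step-up rule: sort the m p-values, find the
    largest rank k with p_(k) <= alpha * k / m, and reject the k smallest.
    """
    m = len(p_values)
    # Indices of the p-values in increasing order.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k = rank  # keep the largest rank that passes its threshold
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, one per comparison group (illustrative only).
example_p = [0.001, 0.02, 0.04, 0.5, 0.9]
flags = benjamini_hochberg(example_p, alpha=0.1)
```

In the setting of the abstract, each entry of `p_values` would be the p-value for one comparison group, and the groups flagged `True` are those reported as differing from the reference. Because the rule is step-up, a p-value that fails its own threshold is still rejected whenever a larger p-value passes a later one.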
Taxonomy
Topics: Advanced Statistical Process Monitoring
