Finding Distributions that Differ, with False Discovery Rate Control

Yonghoon Lee, Edgar Dobriban, and Eric Tchetgen Tchetgen

arXiv:2505.13769·stat.ME·November 26, 2025

Open Access

TL;DR

This paper introduces a distribution-free multiple testing method with false discovery rate control for comparing a reference distribution against multiple others, demonstrated through simulations and real datasets.

Contribution

It develops batch conformal p-values, shows that they satisfy positive regression dependence across groups, and thereby obtains exact FDR control through the Benjamini-Hochberg procedure, with applications to real-world data analysis.

Findings

01 Method achieves FDR control in simulations

02 Comparable performance to distribution-specific methods

03 Identifies significant sub-populations in real datasets

Abstract

We consider the problem of comparing a reference distribution with several other distributions. Given a sample from both the reference and the comparison groups, we aim to identify the comparison groups whose distributions differ from that of the reference group. Viewing this as a multiple testing problem, we introduce a methodology that provides exact, distribution-free control of the false discovery rate. To do so, we introduce the concept of batch conformal p-values and demonstrate that they satisfy positive regression dependence across the groups [Benjamini and Yekutieli, 2001], thereby enabling control of the false discovery rate through the Benjamini-Hochberg procedure. The proof of positive regression dependence introduces a novel technique for the inductive construction of rank vectors with almost sure dominance under exchangeability. We evaluate the performance of the proposed…
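The Benjamini-Hochberg step named in the abstract can be sketched as follows. This is a minimal illustration of the standard step-up procedure only; the function name `benjamini_hochberg`, the default `alpha`, and the input p-values are assumptions made for the example, and the construction of the paper's batch conformal p-values is not reproduced here.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Standard Benjamini-Hochberg step-up procedure (illustrative sketch).

    p_values: one p-value per comparison group (hypothetical inputs).
    alpha:    target false discovery rate level.
    Returns a list of booleans, True where the null hypothesis is rejected.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= alpha * k / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k = rank

    # Reject the k hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for five comparison groups.
flags = benjamini_hochberg([0.001, 0.2, 0.012, 0.8, 0.03], alpha=0.1)
```

With these made-up inputs, the groups at positions 0, 2, and 4 are flagged as differing from the reference. Because the procedure is step-up, a p-value above its own threshold can still be rejected when a larger p-value meets its threshold.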

Taxonomy

Topics: Advanced Statistical Process Monitoring