Large-scale Multiple Testing: Fundamental Limits of False Discovery Rate Control and Compound Oracle

Yutong Nie; Yihong Wu

arXiv:2302.06809·math.ST·September 3, 2025

Open Access

TL;DR

This paper characterizes the fundamental limits of the tradeoff between the false discovery rate (FDR) and the false non-discovery rate (FNR) in large-scale multiple testing, and shows that compound decision rules are necessary for optimal performance.

Contribution

It establishes the asymptotically optimal FDR-FNR tradeoff under the two-group model and shows that optimal rules are inherently compound, not separable, even in simple Gaussian models.

Findings

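01. The asymptotically optimal FDR-FNR tradeoff is derived for large-scale testing under the two-group model.

02. Separable rules are suboptimal compared to compound rules for the FDR-FNR tradeoff.

03. Under high-probability FDP control, the optimal tradeoff matches the optimal tradeoff between the marginal FDR and the marginal FNR.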
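To make the terms in these findings concrete, here is a minimal Python sketch of a separable rule in a two-group Gaussian location model, together with the realized FDP and FNP. It is not the paper's procedure. The function names, the parameter values (`pi1`, `mu`, `threshold`), and the normalization of FNP by the number of non-rejections are all assumptions chosen for illustration.

```python
import random


def simulate_two_group(n, pi1, mu, seed=0):
    """Draw labels and observations from a two-group mixture (illustrative).

    theta_i ~ Bernoulli(pi1); x_i ~ N(0, 1) if theta_i = 0, else N(mu, 1).
    """
    rng = random.Random(seed)
    theta = [1 if rng.random() < pi1 else 0 for _ in range(n)]
    x = [rng.gauss(mu if t else 0.0, 1.0) for t in theta]
    return theta, x


def separable_rule(x, threshold):
    """Reject hypothesis i iff x_i > threshold; decision i depends on x_i only."""
    return [1 if xi > threshold else 0 for xi in x]


def fdp(theta, delta):
    """False discoveries divided by max(1, number of rejections)."""
    false_discoveries = sum(d * (1 - t) for t, d in zip(theta, delta))
    return false_discoveries / max(1, sum(delta))


def fnp(theta, delta):
    """Missed non-nulls divided by max(1, number of non-rejections) (assumed convention)."""
    missed = sum((1 - d) * t for t, d in zip(theta, delta))
    return missed / max(1, len(delta) - sum(delta))


if __name__ == "__main__":
    # Hypothetical parameters: 10% non-nulls with mean shift 2.
    theta, x = simulate_two_group(n=10_000, pi1=0.1, mu=2.0, seed=1)
    delta = separable_rule(x, threshold=2.0)
    print(f"FDP = {fdp(theta, delta):.3f}, FNP = {fnp(theta, delta):.3f}")
```

A compound rule, by contrast, would let each decision depend on the whole sample rather than on `x[i]` alone. The sketch does not implement one, since the summary does not specify the optimal rule.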

Abstract

The false discovery rate (FDR) and the false non-discovery rate (FNR), defined as the expected false discovery proportion (FDP) and the false non-discovery proportion (FNP), are the most popular benchmarks for multiple testing. Despite the theoretical and algorithmic advances in recent years, the optimal tradeoff between the FDR and the FNR has been largely unknown except for certain restricted classes of decision rules, e.g., separable rules, or for other performance metrics, e.g., the marginal FDR and the marginal FNR (mFDR and mFNR). In this paper, we determine the asymptotically optimal FDR-FNR tradeoff under the two-group random mixture model when the number of hypotheses tends to infinity. Distinct from the optimal mFDR-mFNR tradeoff, which is achieved by separable decision rules, the optimal FDR-FNR tradeoff requires compound rules even in the large-sample limit and for models as…
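For reference, the quantities named in the abstract can be written out as follows. This is a sketch using common conventions, not a quotation from the paper: the symbols $\theta_i \in \{0,1\}$ (1 for a non-null hypothesis), $\delta_i \in \{0,1\}$ (1 for a rejection), and the $1 \vee \cdot$ normalization are assumptions, and the paper's exact definitions may differ.

```latex
% Assumed notation: \theta_i = 1 if hypothesis i is non-null,
% \delta_i = 1 if hypothesis i is rejected, i = 1, ..., n.
\begin{align*}
\mathrm{FDP} &= \frac{\sum_{i=1}^{n} (1-\theta_i)\,\delta_i}{1 \vee \sum_{i=1}^{n} \delta_i},
&
\mathrm{FDR} &= \mathbb{E}[\mathrm{FDP}], \\
\mathrm{FNP} &= \frac{\sum_{i=1}^{n} \theta_i\,(1-\delta_i)}{1 \vee \sum_{i=1}^{n} (1-\delta_i)},
&
\mathrm{FNR} &= \mathbb{E}[\mathrm{FNP}], \\
\mathrm{mFDR} &= \frac{\mathbb{E}\big[\sum_{i=1}^{n} (1-\theta_i)\,\delta_i\big]}{\mathbb{E}\big[\sum_{i=1}^{n} \delta_i\big]},
&
\mathrm{mFNR} &= \frac{\mathbb{E}\big[\sum_{i=1}^{n} \theta_i\,(1-\delta_i)\big]}{\mathbb{E}\big[\sum_{i=1}^{n} (1-\delta_i)\big]}.
\end{align*}
```

The FDR and FNR take the expectation of a ratio, while the marginal versions take a ratio of expectations. Under these conventions, a rule is separable when each $\delta_i$ depends only on the $i$-th observation, and compound when it may depend on all $n$ observations.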

Taxonomy

Topics: Scientific Computing and Data Management · Data Quality and Management · Statistical Methods in Clinical Trials