False discovery rate control with unknown null distribution: is it possible to mimic the oracle?
Etienne Roquain, Nicolas Verzelen

TL;DR
This paper investigates the possibility of controlling the false discovery rate when the null distribution is unknown, establishing theoretical limits and proposing methods to approximate the oracle procedure in large-scale testing scenarios.
Contribution
It provides theoretical conditions under which the oracle null distribution can be mimicked and develops confidence regions for the null, including practical goodness-of-fit tests.
Findings
An asymptotic mimicry of the oracle is possible if and only if the sparsity parameter (the number of false nulls) is of order less than n/log(n), where n is the number of tests.
Impossibility results for general null shape models beyond Gaussian.
Development of confidence regions and goodness-of-fit tests for the null distribution.
Abstract
Classical multiple testing theory prescribes the null distribution, which is often too stringent an assumption for today's large-scale experiments. This paper presents theoretical foundations to understand the limitations caused by ignoring the null distribution, and how it can be properly learned from the (same) data set, when possible. We explore this issue in the case where the null distributions are Gaussian with unknown rescaling parameters (mean and variance) and the alternative distribution is left arbitrary. While an oracle procedure in that case is the Benjamini-Hochberg procedure applied with the true (unknown) null distribution, we pursue the aim of building a procedure that asymptotically mimics the performance of the oracle (AMO for short). Our main result states that an AMO procedure exists if and only if the sparsity parameter (number of false nulls) is of order less…
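The oracle named in the abstract is the Benjamini-Hochberg procedure run with the true Gaussian null. The following is a minimal sketch of that idea, not the authors' code: the function names `gaussian_upper_tail` and `bh_with_gaussian_null`, the use of one-sided p-values, the level `alpha=0.05`, and the toy data are all assumptions made for illustration.

```python
import math


def gaussian_upper_tail(z):
    """Return P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def bh_with_gaussian_null(observations, null_mean, null_std, alpha=0.05):
    """Benjamini-Hochberg step-up procedure with a N(null_mean, null_std^2) null.

    One-sided p-values are an assumption of this sketch.
    Returns the sorted indices of the rejected hypotheses.
    """
    n = len(observations)
    p_values = [
        gaussian_upper_tail((y - null_mean) / null_std) for y in observations
    ]
    # Indices ordered from smallest to largest p-value.
    order = sorted(range(n), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) such that p_(k) <= alpha * k / n.
    n_rejected = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / n:
            n_rejected = rank
    return sorted(order[:n_rejected])


# Toy data (hypothetical): nine values scattered around 1, two large values.
observations = [1.0, 0.5, 1.5, 2.0, 0.0, 3.0, -1.0, 1.2, 0.8, 11.0, 12.0]

# Null taken to be N(1, 2^2): only the two large values are rejected.
print(bh_with_gaussian_null(observations, null_mean=1.0, null_std=2.0))  # [9, 10]

# Null wrongly taken to be N(0, 1): the value 3.0 is rejected as well.
print(bh_with_gaussian_null(observations, null_mean=0.0, null_std=1.0))  # [5, 9, 10]
```

On this toy data, swapping the null parameters changes the rejection set, which illustrates why the choice of null matters for the procedure. In the setting of the paper the null parameters are unknown and must be learned from the same data, which this sketch does not attempt.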
Taxonomy
Topics: Statistical Methods in Clinical Trials · Statistical Methods and Bayesian Inference · Statistical Methods and Inference
