Estimating the null distribution for conditional inference and genome-scale screening
David R. Bickel

TL;DR
This paper introduces a method for estimating the null distribution in large-scale genomic testing, improving conditional inference and decision-making by quantifying the benefits of conditioning on the estimated null.
Contribution
It proposes a novel approach to estimate the null distribution for genome-scale testing and introduces an information-theoretic score to guide inference decisions.
Findings
Conditioning on the estimated null improves inference accuracy.
The score quantifies the trade-off between ancillarity and relevance.
Applications demonstrate practical benefits in gene expression analysis.
Abstract
In a novel approach to the multiple testing problem, Efron (2004; 2007) formulated estimators of the distribution of test statistics or nominal p-values under a null distribution suitable for modeling the data of thousands of unaffected genes, non-associated single-nucleotide polymorphisms, or other biological features. Estimators of the null distribution can improve not only the empirical Bayes procedure for which they were originally intended, but also many other multiple comparison procedures. Such estimators serve as the groundwork for the proposed multiple comparison procedure based on a recent frequentist method of minimizing posterior expected loss, exemplified with a non-additive loss function designed for genomic screening rather than for validation. The merit of estimating the null distribution is examined from the vantage point of conditional inference in the remainder of the…
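To make the idea of an estimated null distribution concrete, here is a minimal sketch in Python. It is not Efron's estimator or the procedure proposed in the paper. It assumes the null is normal and estimates its center and spread from the bulk of the z-scores using the median and the scaled median absolute deviation, a simple robust stand-in. The function names, the simulated data, and the mixture parameters are all hypothetical and chosen only for illustration.

```python
import random
import statistics
from statistics import NormalDist


def estimate_empirical_null(z_scores):
    """Estimate a normal null (center, scale) from the bulk of the z-scores.

    Assumption: most features are unaffected, so the median and the scaled
    median absolute deviation (MAD) mostly reflect the null. This is a
    simple robust stand-in, not Efron's estimators.
    """
    center = statistics.median(z_scores)
    mad = statistics.median([abs(z - center) for z in z_scores])
    # Dividing by the 0.75 normal quantile makes the MAD consistent
    # for the standard deviation of a normal distribution.
    scale = mad / NormalDist().inv_cdf(0.75)
    return center, scale


def two_sided_p(z, center=0.0, scale=1.0):
    """Two-sided p-value of z under a normal null N(center, scale^2)."""
    standardized = abs((z - center) / scale)
    return 2.0 * (1.0 - NormalDist().cdf(standardized))


# Hypothetical simulation: 9500 unaffected features whose null is wider
# than the theoretical N(0, 1), plus 500 affected features.
rng = random.Random(0)
z_scores = [rng.gauss(0.1, 1.3) for _ in range(9500)]
z_scores += [rng.gauss(4.0, 1.0) for _ in range(500)]

center, scale = estimate_empirical_null(z_scores)

# The same observed statistic judged against the two candidate nulls.
p_theoretical = two_sided_p(3.0)
p_empirical = two_sided_p(3.0, center, scale)

print(f"estimated null: center={center:.3f}, scale={scale:.3f}")
print(f"p-value at z=3 (theoretical null): {p_theoretical:.4f}")
print(f"p-value at z=3 (estimated null):   {p_empirical:.4f}")
```

In this simulated setting the estimated null is wider than N(0, 1), so a statistic that looks extreme under the theoretical null is less so under the estimated one. Whether conditioning on the estimate is worthwhile in a given data set is the question the paper addresses.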
