A leave-p-out based estimation of the proportion of null hypotheses

Alain Celisse; Stéphane Robin

arXiv:0804.1189·math.ST·December 18, 2008·4 cites

TL;DR

This paper introduces a new density-based estimator for the proportion of true null hypotheses in multiple testing, improving accuracy and power of FDR control procedures.

Contribution

It proposes a novel leave-p-out density estimation method for $\

Findings

01

Estimator outperforms existing methods in MSE across simulations.

02

Plug-in FDR procedure achieves asymptotic control and higher power.

03

Method is effective under independence assumptions.

Abstract

In the multiple testing context, a challenging problem is the estimation of the proportion $π_{0}$ of true-null hypotheses. A large number of estimators of this quantity rely on identifiability assumptions that either appear to be violated on real data, or may be at least relaxed. Under independence, we propose an estimator $\hat{π}_{0}$ based on density estimation using both histograms and cross-validation. Due to the strong connection between the false discovery rate (FDR) and $π_{0}$, many multiple testing procedures (MTP) designed to control the FDR may be improved by introducing an estimator of $π_{0}$. We provide an example of such an improvement (plug-in MTP) based on the procedure of Benjamini and Hochberg. Asymptotic optimality results may be derived for both $\hat{π}_{0}$ and the resulting plug-in procedure. The latter ensures the desired asymptotic control of the FDR, while…


Taxonomy

Topics: Statistical Methods in Clinical Trials · Statistical Methods and Bayesian Inference · Optimal Experimental Design Methods