Distributions associated with simultaneous multiple hypothesis testing

Chang Yu; Daniel Zelterman

arXiv:1802.09018·stat.ME·February 27, 2018

TL;DR

This paper derives the distribution of the number of hypotheses declared significant by the Benjamini-Hochberg FDR-controlling rule, introduces a parametric model for the marginal distribution of p-values, and fits that model to data from three cancer studies. It also models dependence among p-values.

Contribution

The paper provides a distributional framework for the number of discoveries in multiple testing that accommodates both dependence among p-values and non-uniform alternative hypotheses.

Findings

01

The distribution of the number of significant hypotheses is approximated by a mixture of normal and Borel-Tanner distributions.

02

The proposed parametric distribution is fitted to p-value data from three cancer studies.

03

Dependence among p-values can be modeled with copulas and latent variables.
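
To make the latent-variable idea concrete, here is a minimal sketch of one way dependent p-values can arise. It is not the model used in the paper: it assumes a one-factor Gaussian construction in which every z-score shares a single latent normal variable, and the function name `correlated_null_pvalues` and the parameter `rho` are hypothetical names chosen for illustration.

```python
import random
from statistics import NormalDist


def correlated_null_pvalues(m, rho, rng):
    """Draw m one-sided null p-values with equicorrelated z-scores.

    Assumption (not from the paper): each z-score is
    sqrt(rho) * shared + sqrt(1 - rho) * own_noise, so any two
    z-scores have correlation rho, with 0 <= rho <= 1.
    """
    std_normal = NormalDist()          # standard normal, mean 0 and sd 1
    shared = rng.gauss(0.0, 1.0)       # latent variable common to all tests
    pvals = []
    for _ in range(m):
        own_noise = rng.gauss(0.0, 1.0)
        z = (rho ** 0.5) * shared + ((1.0 - rho) ** 0.5) * own_noise
        pvals.append(1.0 - std_normal.cdf(z))   # upper-tail p-value
    return pvals


rng = random.Random(2018)
independent = correlated_null_pvalues(5, 0.0, rng)   # rho = 0: independent
dependent = correlated_null_pvalues(5, 0.8, rng)     # rho = 0.8: strongly tied
```

Each p-value is still marginally uniform under the null, but with `rho` near 1 the p-values in one draw tend to be small together or large together, which is what widens the spread of the number of discoveries.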

Abstract

We develop the distribution of the number of hypotheses found to be statistically significant using the rule from Benjamini and Hochberg (1995) for controlling the false discovery rate (FDR). This distribution has both a small sample form and an asymptotic expression for testing many independent hypotheses simultaneously. We propose a parametric distribution $Ψ_{I} (\cdot)$ to approximate the marginal distribution of p-values under a non-uniform alternative hypothesis. This distribution is useful when there are many different alternative hypotheses and these are not individually well understood. We fit $Ψ_{I}$ to data from three cancer studies and use it to illustrate the distribution of the number of notable hypotheses observed in these examples. We model dependence of sampled p-values using a copula model and a latent variable approach. These methods can be combined to…
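
The quantity whose distribution the abstract describes is the number of rejections made by the Benjamini and Hochberg (1995) step-up rule. As a minimal sketch (the function name `bh_num_rejections` and the example p-values are illustrative, not taken from the paper), the rule sorts the m p-values and rejects the k smallest, where k is the largest index i with p_(i) <= alpha * i / m:

```python
def bh_num_rejections(pvalues, alpha=0.05):
    """Return the number of hypotheses rejected by the BH step-up rule."""
    m = len(pvalues)
    sorted_p = sorted(pvalues)
    k = 0
    # Step-up: keep the LARGEST index i whose ordered p-value is under
    # its threshold, even if some smaller ordered p-values are not.
    for i, p in enumerate(sorted_p, start=1):
        if p <= alpha * i / m:
            k = i
    return k


# Thresholds for m = 4, alpha = 0.05 are 0.0125, 0.025, 0.0375, 0.05.
# The second and third ordered p-values miss their thresholds, but the
# fourth meets its own, so all four hypotheses are rejected.
print(bh_num_rejections([0.01, 0.04, 0.03, 0.045]))  # prints 4
```

Applying this function to repeated simulated sets of p-values gives an empirical version of the distribution of the number of significant hypotheses.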

Taxonomy

Topics: Statistical Methods in Clinical Trials · Statistical Methods and Bayesian Inference · Optimal Experimental Design Methods