Distributions and Statistical Power of Optimal Signal-Detection Methods In Finite Cases

Hong Zhang; Jiashun Jin; Zheyang Wu

arXiv:1702.07082 · math.ST · February 24, 2017 · 1 citation

Open Access

TL;DR

This paper develops analytical methods to compute p-values and power for a broad family of goodness-of-fit tests, enabling practical application of optimal signal-detection methods in finite-sample scenarios, especially in genomics.

Contribution

It provides a general analytical framework for calculating p-values and power of GOF tests, including HC and B-J, in finite samples, bridging the gap between asymptotic theory and practical data analysis.

Findings

1. HC performs best for rare signals.

2. B-J is more robust across different signal patterns.

3. The R package SetTest facilitates practical application.

Abstract

In big data analysis for detecting rare and weak signals among $n$ features, grouping-test methods such as the Higher Criticism test (HC), the Berk-Jones test (B-J), and the $\phi$-divergence test share similar asymptotic optimality as $n \to \infty$. In practical data analysis, however, $n$ is frequently small or at most moderately large. To apply these optimal tests properly and to choose wisely among them in practical studies, it is important to know how to obtain their p-values and statistical power. To address this problem in an even broader context, this paper provides analytical solutions for a general family of goodness-of-fit (GOF) tests that covers these optimal tests. For any given i.i.d. and continuous distributions of the input test statistics of the $n$ features, both the p-value and the statistical power of such a GOF test can be calculated. By calculation we…
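To make the finite-$n$ setting concrete, the following is a minimal sketch of one common variant of the HC statistic, computed from $n$ input p-values, together with a p-value for it under the null hypothesis that the inputs are i.i.d. Uniform(0, 1). Note that the p-value here is estimated by Monte Carlo simulation, which is a stand-in for illustration only and is not the analytical calculation the paper develops. The function names `hc_statistic` and `mc_pvalue`, the restriction of the maximum to the smaller half of the sorted p-values, and the clipping constant are all assumptions of this sketch, not taken from the paper or from the SetTest package.

```python
import numpy as np


def hc_statistic(pvalues):
    """One common variant of the Higher Criticism statistic (assumed form).

    Maximizes the standardized gap between i/n and the i-th smallest
    p-value over the smaller half of the sorted p-values.
    """
    p = np.sort(np.asarray(pvalues, dtype=float))
    # Clip to avoid division by zero when a p-value is exactly 0 or 1.
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    n = p.size
    i = np.arange(1, n + 1)
    z = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1.0 - p))
    k = max(1, n // 2)  # search range: an assumption of this sketch
    return float(np.max(z[:k]))


def mc_pvalue(stat_fn, observed, n, n_sim=2000, seed=0):
    """Monte Carlo p-value under the null of n i.i.d. Uniform(0, 1) inputs.

    This simulation replaces the paper's analytical calculation and is
    used here only for illustration.
    """
    rng = np.random.default_rng(seed)
    null = np.array([stat_fn(rng.uniform(size=n)) for _ in range(n_sim)])
    # Add-one correction keeps the estimate strictly positive.
    return float((1 + np.sum(null >= observed)) / (n_sim + 1))


# Hypothetical input: four features, one with a very small p-value.
example_pvalues = [0.001, 0.5, 0.6, 0.9]
hc_obs = hc_statistic(example_pvalues)
hc_pval = mc_pvalue(hc_statistic, hc_obs, n=len(example_pvalues))
```

With small $n$, as in this hypothetical four-feature input, the null distribution of the statistic can be far from its asymptotic limit, which is the gap that exact finite-sample calculations are meant to close. A simulation like this one becomes expensive when very small p-values must be resolved.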

Taxonomy

Topics: Gene expression and cancer classification · Genetic Associations and Epidemiology · Bayesian Methods and Mixture Models