Distributions and Statistical Power of Optimal Signal-Detection Methods In Finite Cases
Hong Zhang, Jiashun Jin, Zheyang Wu

TL;DR
This paper develops analytical methods to compute p-values and power for a broad family of goodness-of-fit tests, enabling practical application of optimal signal-detection methods in finite-sample scenarios, especially in genomics.
Contribution
It provides a general analytical framework for calculating p-values and power of GOF tests, including HC and B-J, in finite samples, bridging the gap between asymptotic theory and practical data analysis.
Findings
HC performs best for rare signals.
B-J is more robust across different signal patterns.
The R package SetTest facilitates practical application.
Abstract
In big data analysis for detecting rare and weak signals among n features, some grouping-test methods such as the Higher Criticism test (HC), the Berk-Jones test (B-J), and the φ-divergence test share similar asymptotic optimality when n → ∞. However, in practical data analysis n is frequently small, or moderately large at most. In order to properly apply these optimal tests and wisely choose among them for practical studies, it is important to know how to get their p-values and statistical power. To address this problem in an even broader context, this paper provides analytical solutions for a general family of goodness-of-fit (GOF) tests, which covers these optimal tests. For any given i.i.d. and continuous distributions of the input test statistics of the n features, both the p-value and the statistical power of such a GOF test can be calculated. By calculation we…
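To make the quantities in the abstract concrete, the following is a minimal sketch of one common form of the HC statistic together with a simulated null p-value. It is an illustration only: the paper derives p-values analytically, whereas this sketch substitutes plain Monte Carlo simulation under uniform null p-values, and it does not use the SetTest package. The function names `hc_statistic` and `monte_carlo_pvalue`, the search fraction `alpha0`, and the clamping constant are assumptions made for this example.

```python
import math
import random


def hc_statistic(pvalues, alpha0=0.5):
    """One common form of the Higher Criticism statistic.

    Takes the maximum standardized gap between i/n and the i-th smallest
    p-value, searching over the smallest alpha0 fraction of p-values.
    (alpha0 is an illustrative choice, not taken from the paper.)
    """
    n = len(pvalues)
    p_sorted = sorted(pvalues)
    k_max = max(1, int(alpha0 * n))
    best = -math.inf
    for i in range(1, k_max + 1):
        # Clamp to avoid division by zero when a p-value is exactly 0 or 1.
        p = min(max(p_sorted[i - 1], 1e-12), 1 - 1e-12)
        z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1 - p))
        best = max(best, z)
    return best


def monte_carlo_pvalue(observed, n, n_sim=2000, seed=0):
    """Estimate P(HC >= observed) under the null by simulation.

    Null p-values are drawn i.i.d. Uniform(0, 1). This is a simulation
    stand-in for the analytical calculation the paper develops.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_sim):
        null_p = [rng.random() for _ in range(n)]
        if hc_statistic(null_p) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_sim + 1)


if __name__ == "__main__":
    # Hypothetical p-values for n = 10 features, two of them small.
    pvals = [0.001, 0.004, 0.21, 0.35, 0.48, 0.52, 0.66, 0.71, 0.83, 0.97]
    hc = hc_statistic(pvals)
    print("HC statistic:", hc)
    print("Simulated p-value:", monte_carlo_pvalue(hc, n=len(pvals)))
```

Simulation of this kind becomes slow and imprecise for very small p-values, which is one reason exact or analytical finite-sample calculations are attractive when n is small or moderate.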
Taxonomy
Topics: Gene expression and cancer classification · Genetic Associations and Epidemiology · Bayesian Methods and Mixture Models
