Permutation p-value approximation via generalized Stolarsky invariance

Hera Yu He; Kinjal Basu; Qingyuan Zhao; Art B. Owen

arXiv:1603.02757·math.ST·August 10, 2017


TL;DR

This paper introduces a fast, geometry-based approximation method for permutation p-values in genomic data analysis, improving efficiency and accuracy over existing saddlepoint methods.

Contribution

It develops a novel approximation for permutation p-values using Stolarsky's invariance principle, with variance estimation, applicable to two-sample linear test statistics.

Findings

1. When the point estimate is small, the variance of the approximate p-values is a modest multiple of the squared point estimate.

2. It is faster and more accurate than the saddlepoint approximation on a Parkinson's disease data set.

3. The approach offers a probabilistic interpretation of Stolarsky's invariance principle.

Abstract

It is common for genomic data analysis to use $p$-values from a large number of permutation tests. The multiplicity of tests may require very tiny $p$-values in order to reject any null hypotheses and the common practice of using randomly sampled permutations then becomes very expensive. We propose an inexpensive approximation to $p$-values for two sample linear test statistics, derived from Stolarsky's invariance principle. The method creates a geometrically derived set of approximate $p$-values for each hypothesis. The average of that set is used as a point estimate $\hat{p}$ and our generalization of the invariance principle allows us to compute the variance of the $p$-values in that set. We find that in cases where the point estimate is small the variance is a modest multiple of the square of the point estimate, yielding a relative error property similar to that of saddlepoint…
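To make the cost argument in the abstract concrete, here is a minimal sketch of the baseline it refers to: a Monte Carlo permutation p-value built from randomly sampled permutations. It is not the paper's Stolarsky-based approximation. The function names, the choice of linear statistic (the sum of the responses in group 1), the one-sided alternative, and the add-one estimator are all assumptions made for illustration.

```python
import random


def linear_statistic(x, labels):
    """Illustrative two-sample linear statistic: sum of x over group 1.

    Assumption: the abstract does not spell out a statistic, so this is
    one simple example of a statistic that is linear in the responses.
    """
    return sum(xi for xi, g in zip(x, labels) if g == 1)


def mc_permutation_pvalue(x, labels, n_perm=10_000, seed=0):
    """One-sided Monte Carlo permutation p-value (hypothetical helper).

    Returns the add-one estimate (1 + hits) / (1 + n_perm), where hits
    counts sampled relabelings whose statistic is at least the observed one.
    """
    rng = random.Random(seed)
    observed = linear_statistic(x, labels)
    shuffled = list(labels)  # copy so the caller's labels are not mutated
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # one uniformly random relabeling
        if linear_statistic(x, shuffled) >= observed:
            hits += 1
    return (1 + hits) / (1 + n_perm)


if __name__ == "__main__":
    # Toy data: group 1 holds the four largest responses.
    x = [2.1, 1.9, 2.4, 2.2, 0.3, 0.1, -0.2, 0.4]
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    print(mc_permutation_pvalue(x, labels, n_perm=2000))
```

With this estimator the reported value can never fall below 1 / (n_perm + 1), so resolving a p-value near 1e-8 would take on the order of 1e8 sampled permutations per hypothesis. That is the expense the abstract's approximation is meant to avoid.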
