permApprox: a general framework for accurate permutation p-value approximation
Stefanie Peschel, Anne-Laure Boulesteix, Erika von Mutius, Christian L. Müller

TL;DR
permApprox provides a robust, accurate, and zero-free permutation p-value approximation method that improves hypothesis testing reliability, especially with limited permutations, by enforcing support constraints and integrating tail modeling.
Contribution
It introduces a novel framework and R package that enforce support constraints during GPD tail modeling to prevent zero p-values in permutation testing.
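To make the role of the support constraint concrete, the standard GPD tail approximation can be written out as below. This is a sketch from general extreme-value theory, not a transcription of the paper's estimator: the symbols $u$ (threshold), $B$ (number of permutations), $N_u$ (number of permuted statistics above $u$), and the exact form of the constraint in the last line are assumptions introduced here for illustration.

```latex
% GPD survival function for an exceedance y = t - u >= 0 (shape \xi \neq 0, scale \sigma > 0)
\bar{G}_{\xi,\sigma}(y) = \left(1 + \frac{\xi\, y}{\sigma}\right)^{-1/\xi}

% For \xi < 0 the support is bounded: 0 \le y \le -\sigma/\xi,
% and \bar{G}_{\xi,\sigma}(y) = 0 for every y \ge -\sigma/\xi.

% Tail-based p-value approximation for an observed statistic t_obs > u
\hat{p} \approx \frac{N_u}{B}\, \bar{G}_{\hat\xi,\hat\sigma}(t_{\mathrm{obs}} - u)

% Hence \hat{p} = 0 whenever \hat\xi < 0 and t_obs - u \ge -\hat\sigma/\hat\xi.
% One way to state a support constraint that rules this out (assumed form):
\hat\xi \ge 0 \quad \text{or} \quad -\frac{\hat\sigma}{\hat\xi} > t_{\mathrm{obs}} - u
```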
Findings
permApprox yields accurate p-values across various tests and effect sizes.
It produces smooth, interpretable p-value distributions with few permutations.
The method is effective in high-dimensional, real-world data applications.
Abstract
Permutation procedures are common practice in hypothesis testing when distributional assumptions about the test statistic are not met or unknown. With only few permutations, empirical p-values lie on a coarse grid and may even be zero when the observed test statistic exceeds all permuted values. Such zero p-values are statistically invalid and hinder multiple testing correction. Parametric tail modeling with the Generalized Pareto Distribution (GPD) has been proposed to address this issue, but existing implementations can again yield zero p-values when the estimated shape parameter is negative and the fitted distribution has a finite upper bound. We introduce a method for accurate and zero-free p-value approximation in permutation testing, embedded in the permApprox workflow and R package. Building on GPD tail modeling, the method enforces a support constraint during parameter…
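The two failure modes described in the abstract can be reproduced in a few lines. The following is a minimal Python sketch, not the permApprox R implementation: the helper names `empirical_pvalue` and `gpd_tail_pvalue`, the threshold choice, and the hand-picked shape and scale values are all hypothetical and chosen only to show when a zero p-value arises. It relies on `scipy.stats.genpareto`, whose survival function is zero beyond the finite upper endpoint when the shape is negative.

```python
import numpy as np
from scipy.stats import genpareto


def empirical_pvalue(t_obs, t_perm):
    """Raw empirical p-value: fraction of permuted statistics >= observed."""
    return float(np.mean(np.asarray(t_perm) >= t_obs))


def gpd_tail_pvalue(t_obs, t_perm, threshold, shape, scale):
    """Tail approximation: P(T > u) times the GPD survival of the exceedance."""
    exceed_frac = np.mean(np.asarray(t_perm) > threshold)
    tail_prob = genpareto.sf(t_obs - threshold, shape, loc=0.0, scale=scale)
    return float(exceed_frac * tail_prob)


rng = np.random.default_rng(0)
t_perm = rng.standard_normal(1000)   # stand-in for permuted test statistics
t_obs = 6.0                          # observed statistic beyond every permuted value

# 1) Empirical p-value is exactly zero when t_obs exceeds all permuted values.
p_emp = empirical_pvalue(t_obs, t_perm)

# 2) GPD with negative shape: finite upper endpoint u + scale / |shape|.
u = np.quantile(t_perm, 0.9)         # hypothetical threshold choice
shape = -0.2                         # hand-picked, not estimated
scale_bad = 0.5                      # endpoint u + 2.5 lies below t_obs
p_bounded = gpd_tail_pvalue(t_obs, t_perm, u, shape, scale_bad)

# 3) Parameters satisfying the support condition -scale/shape > t_obs - u
#    (illustration only; the paper enforces this during estimation).
scale_ok = 1.1 * (-shape) * (t_obs - u)
p_constrained = gpd_tail_pvalue(t_obs, t_perm, u, shape, scale_ok)

print(p_emp, p_bounded, p_constrained)
```

In this sketch the first two p-values are exactly zero, while the third is small but strictly positive because the fitted support extends past the observed statistic. Rescaling by hand is only a device for the demonstration; it says nothing about how the constrained estimates are actually obtained.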
Taxonomy
Topics: Single-cell and spatial transcriptomics · Genetic Associations and Epidemiology · Gene expression and cancer classification
