Rank-transformed subsampling: inference for multiple data splitting and exchangeable p-values
F. Richard Guo, Rajen D. Shah

TL;DR
This paper introduces rank-transformed subsampling, a method for large-sample inference on test statistics or p-values combined across multiple random data splits. It controls type-I error asymptotically at the nominal level and gains power over ordinary subsampling.
Contribution
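The authors develop rank-transformed subsampling as a general approach to large-sample inference for statistics or p-values combined across multiple random realisations. It addresses two drawbacks of randomised tests: results that can differ between analyses of the same dataset, and power loss from not using the full sample.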
Findings
Controls type-I error asymptotically at the nominal level
Reduces bias and increases power compared to ordinary subsampling
Applicable to diverse testing problems including high-dimensional and sequential data
Abstract
Many testing problems are readily amenable to randomised tests such as those employing data splitting. However, despite their usefulness in principle, randomised tests have obvious drawbacks. Firstly, two analyses of the same dataset may lead to different results. Secondly, the test typically loses power because it does not fully utilise the entire sample. As a remedy to these drawbacks, we study how to combine the test statistics or p-values resulting from multiple random realisations such as through random data splits. We develop rank-transformed subsampling as a general method for delivering large sample inference about the combined statistic or p-value under mild assumptions. We apply our methodology to a wide range of problems, including testing unimodality in high-dimensional data, testing goodness-of-fit of parametric quantile regression models, testing no direct effect in a…
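To make the setting in the abstract concrete, here is a minimal Python sketch of a data-splitting test whose p-value changes from one random split to the next. Everything in it is an illustrative assumption rather than something taken from the paper: the toy Gaussian data, the split test itself (one half picks a direction, the other half runs a one-sided z-test along it), and the function names. The combination rule at the end is not rank-transformed subsampling. It is the simple "twice the average" rule, a conservative baseline that is known to give a valid p-value under arbitrary dependence between the individual p-values.

```python
import math

import numpy as np


def split_p_value(x, rng):
    """One-sided p-value for H0: E[x] = 0 from a single random 50/50 split.

    Hypothetical toy test: the first half of the sample picks a direction,
    the second half runs a z-test on the data projected onto that direction.
    """
    n = x.shape[0]
    perm = rng.permutation(n)
    first, second = x[perm[: n // 2]], x[perm[n // 2:]]
    direction = first.mean(axis=0)
    norm = np.linalg.norm(direction)
    if norm == 0.0:
        return 1.0  # no direction to test along
    proj = second @ (direction / norm)
    z = math.sqrt(len(proj)) * proj.mean() / proj.std(ddof=1)
    # Upper-tail probability of a standard normal at z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def multi_split_p_values(x, n_splits, seed=0):
    """Repeat the split test over many random splits of the same data."""
    rng = np.random.default_rng(seed)
    return np.array([split_p_value(x, rng) for _ in range(n_splits)])


def twice_mean_p_value(p_values):
    """Conservative baseline: twice the average p-value, capped at 1."""
    return min(1.0, 2.0 * float(np.mean(p_values)))


# Toy data generated under the null hypothesis (mean zero in every coordinate).
data_rng = np.random.default_rng(1)
x = data_rng.normal(size=(200, 50))

p = multi_split_p_values(x, n_splits=100)
combined = twice_mean_p_value(p)
print(f"single-split p-values range from {p.min():.3f} to {p.max():.3f}")
print(f"twice-the-average combined p-value: {combined:.3f}")
```

The spread of the single-split p-values illustrates the first drawback the abstract names: two analysts using different splits can reach different conclusions from the same data. The factor of two in the baseline rule is the price paid for validity under arbitrary dependence, and that conservativeness is the kind of power loss that motivates sharper inference on the combined statistic.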
Taxonomy
Topics: Statistical Methods and Inference · Advanced Causal Inference Techniques · Statistical Methods and Bayesian Inference
