Rank-transformed subsampling: inference for multiple data splitting and exchangeable p-values

F. Richard Guo; Rajen D. Shah

arXiv:2301.02739 · stat.ME · September 5, 2024 · 1 citation

TL;DR

This paper introduces rank-transformed subsampling, a method for combining test statistics or p-values from multiple random data splits that improves power over a single split and controls type-I error asymptotically across a range of testing problems.

Contribution

The authors develop rank-transformed subsampling as a general approach to large-sample inference on the combined statistic or p-value, addressing two drawbacks of randomised tests: loss of power, and results that differ between analyses of the same dataset.

Findings

01. Controls type-I error asymptotically at the nominal level

02. Reduces bias and increases power compared to ordinary subsampling

03. Applicable to diverse testing problems including high-dimensional and sequential data

Abstract

Many testing problems are readily amenable to randomised tests such as those employing data splitting. However, despite their usefulness in principle, randomised tests have obvious drawbacks. Firstly, two analyses of the same dataset may lead to different results. Secondly, the test typically loses power because it does not fully utilise the entire sample. As a remedy to these drawbacks, we study how to combine the test statistics or p-values resulting from multiple random realisations, such as through random data splits. We develop rank-transformed subsampling as a general method for delivering large sample inference about the combined statistic or p-value under mild assumptions. We apply our methodology to a wide range of problems, including testing unimodality in high-dimensional data, testing goodness-of-fit of parametric quantile regression models, testing no direct effect in a…
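To make the first drawback concrete, the following is a minimal sketch of a toy split-sample test, run over many random splits of one dataset. Everything in it is an assumption made for illustration: the test itself (`split_p_value`, a projected one-sided z-test with unit variance taken as known), the simulated data, and the combination rule. The rule used, twice the arithmetic mean of the p-values, is a standard conservative combination that remains valid under arbitrary dependence; it is not the rank-transformed subsampling procedure developed in the paper, and serves only as a simple baseline.

```python
import math
import random


def split_p_value(data, rng):
    """One-sided p-value from a single random split (hypothetical toy test).

    The first half of the sample picks a direction; the second half tests
    whether the mean along that direction is positive.
    """
    idx = list(range(len(data)))
    rng.shuffle(idx)
    half = len(idx) // 2
    first = [data[i] for i in idx[:half]]
    second = [data[i] for i in idx[half:]]
    dim = len(data[0])

    # Direction estimated on the first half: the normalised sample mean.
    mean_first = [sum(row[j] for row in first) / len(first) for j in range(dim)]
    norm = math.sqrt(sum(m * m for m in mean_first)) or 1.0
    direction = [m / norm for m in mean_first]

    # z-test on the second half (unit variance per coordinate is assumed).
    proj = [sum(row[j] * direction[j] for j in range(dim)) for row in second]
    z = math.sqrt(len(proj)) * sum(proj) / len(proj)
    return 0.5 * math.erfc(z / math.sqrt(2))  # equals 1 - Phi(z)


def twice_mean_p_value(p_values):
    """Conservative combination: twice the average p-value, capped at 1."""
    return min(1.0, 2.0 * sum(p_values) / len(p_values))


rng = random.Random(0)
# Simulated data with a small mean shift in every coordinate (an assumption).
data = [[rng.gauss(0.3, 1.0) for _ in range(5)] for _ in range(200)]

# The same dataset analysed with 50 different random splits.
p_values = [split_p_value(data, rng) for _ in range(50)]
combined = twice_mean_p_value(p_values)

print("smallest single-split p-value:", min(p_values))
print("largest single-split p-value:", max(p_values))
print("combined (twice the mean):", combined)
```

The spread between the smallest and largest single-split p-values shows how much the conclusion can depend on the split. The factor of two is the price paid for making no assumption about the dependence between splits, which is the kind of conservativeness that a calibrated procedure for the combined statistic aims to avoid.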

Code & Models

Repositories

richardkwo/multisplit (official)

Taxonomy

Topics: Statistical Methods and Inference · Advanced Causal Inference Techniques · Statistical Methods and Bayesian Inference