RAPTT: An Exact Two-Sample Test in High Dimensions Using Random Projections
Radhendushka Srivastava, Ping Li, David Ruppert

TL;DR
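RAPTT is an exact two-sample test for equality of means of two normal populations in high dimensions. It projects the data onto random lower-dimensional subspaces, requires no constraints on the dimension or the sample size, and in simulations often has greater power than competing tests. It is illustrated on gene expression data from tumor and normal colon tissues.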
Contribution
The paper proposes RAPTT (RAndom Projection T-Test), a random projection-based test for comparing the means of two high-dimensional normal populations. The test is exact and requires no constraints on the dimension of the data or the sample size.
Findings
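In a simulation study, the power of RAPTT in high dimensions is often greater than that of competing tests.
The test is illustrated on high-dimensional gene expression data, where it discriminates tumor from normal colon tissues.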
RAPTT is computationally feasible for large datasets.
Abstract
In high dimensions, the classical Hotelling's test tends to have low power or becomes undefined due to singularity of the sample covariance matrix. In this paper, this problem is overcome by projecting the data matrix onto lower dimensional subspaces through multiplication by random matrices. We propose RAPTT (RAndom Projection T-Test), an exact test for equality of means of two normal populations based on projected lower dimensional data. RAPTT does not require any constraints on the dimension of the data or the sample size. A simulation study indicates that in high dimensions the power of this test is often greater than that of competing tests. The advantage of RAPTT is illustrated on high-dimensional gene expression data involving the discrimination of tumor and normal colon tissues.
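The projection idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' full RAPTT procedure: it applies a single random projection, and the Gaussian projection matrix, the function name `projected_hotelling_test`, and the choice of `k` are assumptions made here for illustration. Because the projection matrix is drawn independently of the data, the projected samples are again normal, so the classical Hotelling's test on the k-dimensional data keeps its exact F reference distribution under the null hypothesis.

```python
import numpy as np
from scipy import stats


def projected_hotelling_test(x, y, k, rng):
    """Hotelling's T^2 test on data projected to k dimensions.

    x: (n1, p) array, y: (n2, p) array; requires k <= n1 + n2 - 2.
    Returns (T^2 statistic, p-value).
    """
    n1, p = x.shape
    n2 = y.shape[0]
    if k > n1 + n2 - 2:
        raise ValueError("k must not exceed n1 + n2 - 2")

    # Random projection matrix drawn independently of the data
    # (Gaussian entries are an assumption made for this sketch).
    r = rng.standard_normal((p, k))
    xp, yp = x @ r, y @ r

    diff = xp.mean(axis=0) - yp.mean(axis=0)

    # Pooled sample covariance of the projected data (k x k).
    xc = xp - xp.mean(axis=0)
    yc = yp - yp.mean(axis=0)
    pooled = (xc.T @ xc + yc.T @ yc) / (n1 + n2 - 2)

    t2 = (n1 * n2) / (n1 + n2) * (diff @ np.linalg.solve(pooled, diff))

    # Under H0 and normality, the scaled statistic is F(k, n1 + n2 - k - 1).
    df2 = n1 + n2 - k - 1
    f_stat = t2 * df2 / (k * (n1 + n2 - 2))
    p_value = stats.f.sf(f_stat, k, df2)
    return t2, p_value


# Usage with synthetic data: p = 500 far exceeds n1 + n2 = 45,
# so the unprojected sample covariance matrix would be singular.
rng = np.random.default_rng(0)
x = rng.standard_normal((20, 500))
y = rng.standard_normal((25, 500)) + 0.3
t2, p_value = projected_hotelling_test(x, y, k=5, rng=rng)
```

The result of a single projection depends on the random draw, so a practical procedure would need to combine several projections; how RAPTT does this is described in the paper and is not reproduced here.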
Taxonomy
Topics: Random Matrices and Applications · Gene expression and cancer classification · Statistical Methods and Inference
