Statistical applications of Random matrix theory: comparison of two populations III
Rémy Mariétan, Stephan Morgenthaler

TL;DR
This paper develops a statistical test based on the largest eigenvalues of covariance matrices, inspired by random matrix theory, to compare two large, potentially dependent datasets, improving accuracy and robustness.
Contribution
It extends previous work on eigenvalue perturbation and residual spikes, introducing a new test statistic that enhances the comparison of two covariance matrices in high-dimensional settings.
Findings
The new test accurately detects differences in large covariance matrices.
Simulation results demonstrate robustness without strict assumptions.
The method improves upon previous eigenvalue-based tests.
Abstract
This paper investigates a statistical procedure for testing the equality of two independently estimated covariance matrices when the number of potentially dependent data vectors is large and proportional to the size of the vectors, that is, the number of variables. Inspired by the spike models used in random matrix theory, we concentrate on the largest eigenvalues of the matrices in order to determine significant differences. To avoid false rejections, we must guard against residual spikes and need a sufficiently precise description of the properties of the largest eigenvalues under the null hypothesis. In this paper, we extend arXiv:2002.12741 for perturbation of order and arXiv:2002.12703 studying a simpler statistic. The residual spike introduced in the first paper is investigated and leads to a statistic that results in a good test of equality of two populations. Simulations show…
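The general idea of comparing two covariance matrices through extreme eigenvalues can be illustrated with a minimal sketch. This is not the authors' statistic: it looks at the largest and smallest eigenvalues of the whitened matrix S_X^{-1/2} S_Y S_X^{-1/2} and, as a swapped-in calibration, uses a permutation null in place of the random-matrix description of residual spikes developed in the paper. The function names `extreme_eigs` and `permutation_pvalue` are hypothetical. The sketch assumes more observations than variables in each sample and exchangeable rows, which is stricter than the dependent-data setting of the abstract.

```python
import numpy as np


def extreme_eigs(X, Y):
    """Largest and smallest eigenvalues of S_X^{-1/2} S_Y S_X^{-1/2}.

    X and Y are (n_obs, n_vars) arrays; S_X must be invertible (n_obs > n_vars).
    """
    Sx = np.cov(X, rowvar=False)
    Sy = np.cov(Y, rowvar=False)
    # Symmetric inverse square root of S_X from its eigendecomposition.
    w, V = np.linalg.eigh(Sx)
    inv_sqrt = (V / np.sqrt(w)) @ V.T
    ev = np.linalg.eigvalsh(inv_sqrt @ Sy @ inv_sqrt)  # ascending order
    return ev[-1], ev[0]


def permutation_pvalue(X, Y, n_perm=99, seed=0):
    """Permutation p-value for H0: equal covariances (illustrative only).

    The statistic max(lambda_max, 1 / lambda_min) is large when some
    direction has much more variance in one sample than in the other.
    """
    rng = np.random.default_rng(seed)
    lam_max, lam_min = extreme_eigs(X, Y)
    observed = max(lam_max, 1.0 / lam_min)

    pooled = np.vstack([X, Y])
    n_x = X.shape[0]
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        a, b = extreme_eigs(pooled[idx[:n_x]], pooled[idx[n_x:]])
        if max(a, 1.0 / b) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, p = 200, 20
    X = rng.standard_normal((n, p))
    Y = rng.standard_normal((n, p))
    Y[:, 0] *= np.sqrt(10.0)  # one spiked direction in the second population
    print(permutation_pvalue(X, Y))
```

A permutation null is simple to implement but expensive, and it breaks down when the data vectors are dependent; the analytical description of the largest eigenvalues under the null hypothesis is what the paper aims to provide for that regime.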
Taxonomy
Topics: Random Matrices and Applications
