Testing for equal correlation matrices with application to paired gene expression data
Adria Caballe, Natalia Bochkina, Claus Mayer, Ioannis Papastathopoulos

TL;DR
This paper introduces a new statistical method for testing the equality of correlation matrices in high-dimensional paired datasets, with applications in gene expression analysis to identify biological pathway differences between healthy and tumor samples.
Contribution
The paper develops test statistics based on the average of squares, the maximum, and the sum of exceedances of Fisher-transformed sample correlations, and derives approximate null distributions for them using asymptotic and non-parametric approaches.
Findings
A large part of the gene pathway lists show significantly different correlation matrices between healthy and tumor samples.
Theoretical power results, backed by simulation experiments, indicate that the proposed tests detect changes in correlation structure in high-dimensional data.
Application to colorectal cancer data reveals biologically relevant pathway alterations.
Abstract
We present a novel method for testing the hypothesis of equality of two correlation matrices using paired high-dimensional datasets. We consider test statistics based on the average of squares, maximum and sum of exceedances of Fisher transform sample correlations and we derive approximate null distributions using asymptotic and non-parametric distributions. Theoretical results on the power of the tests are presented and backed up by a range of simulation experiments. We apply the methodology to a case study of colorectal tumour gene expression data with the aim of discovering biological pathway lists of genes that present significantly different correlation matrices on healthy and tumour samples. We find strong evidence for a large part of the pathway lists correlation matrices to change among the two medical conditions.
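The three statistics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the scaling factor `sqrt((n - 3) / 2)`, the exceedance threshold, and the within-pair swap permutation scheme are all assumptions made for illustration, and the paper's exact definitions and null approximations may differ.

```python
import numpy as np


def fisher_z_upper(x):
    """Fisher-transformed sample correlations (upper triangle) of an n x p array."""
    r = np.corrcoef(x, rowvar=False)
    iu = np.triu_indices(r.shape[0], k=1)
    # Clip to keep arctanh finite if a sample correlation is numerically +/-1.
    return np.arctanh(np.clip(r[iu], -0.999999, 0.999999))


def statistics(x, y, threshold=2.0):
    """Three summary statistics of the squared Fisher-z differences (hypothetical helper)."""
    n = x.shape[0]
    # Assumption: sqrt((n - 3) / 2) is the standardisation for independent samples;
    # with paired data the true variance differs, so this is only a rough scaling.
    d = np.sqrt((n - 3) / 2.0) * (fisher_z_upper(x) - fisher_z_upper(y))
    sq = d ** 2
    return {
        "avg_squares": float(sq.mean()),
        "maximum": float(sq.max()),
        # Assumption: one plausible reading of "sum of exceedances" over a threshold.
        "sum_exceedances": float(np.clip(sq - threshold ** 2, 0.0, None).sum()),
    }


def paired_permutation_pvalues(x, y, n_perm=200, threshold=2.0, seed=0):
    """Non-parametric p-values obtained by randomly swapping conditions within each pair."""
    rng = np.random.default_rng(seed)
    observed = statistics(x, y, threshold)
    counts = dict.fromkeys(observed, 0)
    for _ in range(n_perm):
        swap = rng.random(x.shape[0]) < 0.5  # one swap decision per paired subject
        xp = np.where(swap[:, None], y, x)
        yp = np.where(swap[:, None], x, y)
        perm = statistics(xp, yp, threshold)
        for key in observed:
            counts[key] += int(perm[key] >= observed[key])
    # Add-one correction keeps every p-value strictly positive.
    return {key: (counts[key] + 1) / (n_perm + 1) for key in observed}
```

Here `x` and `y` would be arrays of shape (subjects, genes) for one pathway list, with row `i` of each array coming from the same subject. Swapping within pairs preserves the pairing under the null hypothesis of equal correlation matrices, which is why it is used in this sketch instead of shuffling rows freely.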
Taxonomy
Topics: Gene expression and cancer classification · Statistical Methods and Inference · Bioinformatics and Genomic Networks
