Covariance Matrix Estimation for High-Throughput Biomedical Data with Interconnected Communities
Yifan Yang, Chixiang Chen, Shuo Chen

TL;DR
This paper introduces a covariance matrix estimation method that exploits the interconnected community structure in high-dimensional biomedical data. Theoretical analysis and empirical results indicate improved accuracy over existing methods.
Contribution
The paper develops an estimator that leverages community structure in biomedical data, with closed-form solutions and proven asymptotic properties, to improve the accuracy of covariance matrix estimation.
Findings
The proposed estimator outperforms existing methods in simulations.
The method achieves more accurate covariance estimates in real biomedical datasets.
Theoretical analysis confirms the estimator's optimality and asymptotic properties.
Abstract
Estimating a covariance matrix is central to high-dimensional data analysis. Empirical analyses of high-dimensional biomedical data, including genomics, proteomics, microbiome, and neuroimaging, among others, consistently reveal strong modularity in the dependence patterns. In these analyses, intercorrelated high-dimensional biomedical features often form communities or modules that can be interconnected with others. While the interconnected community structure has been extensively studied in biomedical research (e.g., gene co-expression networks), its potential to assist in the estimation of covariance matrices remains largely unexplored. To address this gap, we propose a procedure that leverages the commonly observed interconnected community structure in high-dimensional biomedical data to estimate large covariance and precision matrices. We derive the uniformly minimum-variance…
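To make the idea of using community structure concrete, the following is a minimal sketch and not the authors' estimator. It assumes that community labels are already known and that entries within a block share a common value (a uniform-block assumption the truncated abstract does not spell out). The function name `block_average_covariance` and the toy inputs are hypothetical. Under those assumptions, a sample covariance matrix can be smoothed by averaging its entries block by block:

```python
from collections import defaultdict


def block_average_covariance(S, labels):
    """Smooth a sample covariance matrix S (list of lists) by replacing each
    entry with the mean of its block. Illustrative sketch only: assumes known
    community labels and a common value within each block."""
    p = len(S)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(p):
        for j in range(p):
            # Variances (i == j) are averaged separately from covariances.
            key = (labels[i], labels[j], i == j)
            sums[key] += S[i][j]
            counts[key] += 1
    return [
        [sums[(labels[i], labels[j], i == j)] / counts[(labels[i], labels[j], i == j)]
         for j in range(p)]
        for i in range(p)
    ]


# Toy example: features 0 and 1 form one community, feature 2 another.
S = [
    [1.0, 0.5, 0.25],
    [0.5, 3.0, 0.5],
    [0.25, 0.5, 4.0],
]
for row in block_average_covariance(S, [0, 0, 1]):
    print(row)
```

Averaging reduces the number of free parameters from order p squared to order K squared for K communities, which is the intuition behind why community structure can help when the number of features is large relative to the sample size.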
Taxonomy
Topics: Bioinformatics and Genomic Networks · Gene expression and cancer classification · Complex Network Analysis Techniques
