Cleaning large-dimensional covariance matrices for correlated samples
Zdzislaw Burda, Andrzej Jarosz

TL;DR
This paper addresses the challenge of estimating large covariance matrices with correlated samples by extending theoretical models, developing an efficient algorithm, and providing an open-source Python library for practical implementation.
Contribution
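The paper generalizes the Marcenko-Pastur equation and the Ledoit-Peche shrinkage estimator to correlated samples, and introduces an efficient algorithm, based on the Ledoit-Wolf kernel estimation technique, together with an open-source Python library called "shrinkage".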
Findings
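Generalized the Marcenko-Pastur equation and the Ledoit-Peche shrinkage estimator to correlated samples, using random matrix theory and free probability
Developed an efficient algorithm that implements the resulting analytic formulas, based on the Ledoit-Wolf kernel estimation technique
Provided the open-source Python library "shrinkage" and demonstrated it on synthetic data with exponentially decaying auto-correlations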
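To make the idea of eigenvalue shrinkage concrete, here is a minimal NumPy sketch of the "oracle" estimator that Ledoit-Peche-type formulas are generally understood to approximate: keep the eigenvectors of the sample covariance matrix and replace each sample eigenvalue with the quadratic form of the true covariance along that eigenvector. This is not the paper's algorithm and not the "shrinkage" library API. It assumes the true covariance C is known (which it never is in practice) and, for simplicity, uses uncorrelated samples. The function name and all parameter values are illustrative assumptions.

```python
import numpy as np

def oracle_shrinkage(E, C):
    """Oracle eigenvalue shrinkage (hypothetical helper, for illustration only).

    Keeps the eigenvectors u_i of the sample covariance E and replaces the
    sample eigenvalues with xi_i = u_i^T C u_i, where C is the true covariance.
    """
    lam, U = np.linalg.eigh(E)                   # sample eigenvalues and eigenvectors
    xi = np.einsum("ji,jk,ki->i", U, C, U)       # xi_i = u_i^T C u_i
    Xi = (U * xi) @ U.T                          # sum_i xi_i u_i u_i^T
    return Xi, lam, xi

rng = np.random.default_rng(0)
N, T = 40, 80                                    # assumed dimensions, N/T = 0.5
C = np.diag(np.linspace(1.0, 3.0, N))            # assumed true covariance
Y = np.sqrt(C) @ rng.standard_normal((N, T))     # uncorrelated Gaussian samples
E = Y @ Y.T / T                                  # sample covariance estimator

Xi, lam, xi = oracle_shrinkage(E, C)
err_sample = np.linalg.norm(E - C, "fro")
err_oracle = np.linalg.norm(Xi - C, "fro")
print(f"Frobenius error, sample covariance: {err_sample:.3f}")
print(f"Frobenius error, oracle shrinkage:  {err_oracle:.3f}")
```

Among all matrices that share the sample eigenvectors, the oracle minimizes the Frobenius distance to C, so its error can never exceed that of the sample covariance matrix. A practical estimator has to approximate the xi values from the data alone.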
Abstract
We elucidate the problem of estimating large-dimensional covariance matrices in the presence of correlations between samples. To this end, we generalize the Marcenko-Pastur equation and the Ledoit-Peche shrinkage estimator using methods of random matrix theory and free probability. We develop an efficient algorithm that implements the corresponding analytic formulas, based on the Ledoit-Wolf kernel estimation technique. We also provide an associated open-source Python library, called "shrinkage", with a user-friendly API to assist in practical tasks of estimation of large covariance matrices. We present an example of its usage for synthetic data generated according to exponentially-decaying auto-correlations.
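As a rough illustration of the synthetic setting mentioned at the end of the abstract, the following NumPy sketch draws Gaussian data whose samples are auto-correlated with an exponentially decaying profile and then forms the sample covariance matrix. It does not use the "shrinkage" library. The function names, the specific form A[s, t] = exp(-|s - t| / tau), the choice of true covariance, and all parameter values are assumptions made for this example.

```python
import numpy as np

def exp_decay_autocorr(T, tau):
    """T x T auto-correlation matrix with entries exp(-|s - t| / tau)."""
    idx = np.arange(T)
    return np.exp(-np.abs(idx[:, None] - idx[None, :]) / tau)

def sample_correlated(C, A, rng):
    """Draw an N x T Gaussian data matrix with E[Y_is Y_jt] = C_ij * A_st."""
    N, T = C.shape[0], A.shape[0]
    X = rng.standard_normal((N, T))
    # Cholesky factors act as matrix square roots: L L^T = C, R R^T = A.
    L = np.linalg.cholesky(C)
    R = np.linalg.cholesky(A)
    return L @ X @ R.T

rng = np.random.default_rng(0)
N, T, tau = 50, 200, 3.0                 # assumed dimensions and decay time
C = np.diag(np.linspace(1.0, 3.0, N))    # assumed true covariance
A = exp_decay_autocorr(T, tau)
Y = sample_correlated(C, A, rng)

E = Y @ Y.T / T                          # sample covariance estimator
sample_eigs = np.linalg.eigvalsh(E)
print("true eigenvalue range:  ", C.diagonal().min(), C.diagonal().max())
print("sample eigenvalue range:", sample_eigs.min(), sample_eigs.max())
```

Because A has ones on its diagonal, E is still an unbiased estimator of C, but its eigenvalues spread well beyond the true range. This distortion of the spectrum is what a cleaning procedure of the kind described in the abstract aims to correct.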
Taxonomy
Topics: Random Matrices and Applications
