Random selection of factors preserves the correlation structure in a linear factor model to a high degree
Antti J. Tanskanen, Jani Lukkarinen, Kari Vatanen

TL;DR
This paper introduces the random factor model, a linear factor model whose factors are chosen at random using the random projection method. In high-dimensional spaces, the model preserves the covariance and correlation structure of the data with high probability.
Contribution
The paper proposes a novel random factor model leveraging random projections, providing probabilistic bounds for covariance and correlation preservation in high-dimensional data.
Findings
Covariance matrices are well preserved by the random factor model.
The model accurately reproduces time-series and cross-correlation coefficients.
An application to the Russell 3000 equity index illustrates how well time series and their cross-correlation coefficients are reproduced.
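To make the covariance-preservation finding concrete, here is a minimal sketch using a generic Gaussian random projection. It is not the paper's exact construction: the synthetic data, the function names `cosine_matrix` and `projected_correlation_error`, and all parameter values are assumptions chosen for illustration. It compresses each centered series from `n_obs` observations to `n_factors` random coordinates and compares the correlation matrices before and after.

```python
import numpy as np

def cosine_matrix(a):
    """Cosine similarity between columns of a (equals correlation if columns are centered)."""
    gram = a.T @ a
    norms = np.sqrt(np.diag(gram))
    return gram / np.outer(norms, norms)

def projected_correlation_error(n_obs=5000, n_series=20, n_factors=1000, seed=0):
    """Largest absolute change in pairwise correlation after a random projection.

    Hypothetical setup: synthetic series sharing one common driver, so the
    true pairwise correlation is about 0.2.
    """
    rng = np.random.default_rng(seed)
    common = rng.standard_normal((n_obs, 1))
    x = 0.5 * common + rng.standard_normal((n_obs, n_series))
    x -= x.mean(axis=0)  # center each series

    # Gaussian random projection from n_obs down to n_factors dimensions.
    r = rng.standard_normal((n_factors, n_obs)) / np.sqrt(n_factors)
    y = r @ x

    return float(np.max(np.abs(cosine_matrix(x) - cosine_matrix(y))))

max_err = projected_correlation_error()
print(f"largest correlation error after projection: {max_err:.3f}")
```

In this setup the error in each correlation coefficient should shrink roughly like one over the square root of `n_factors`, which is the usual behavior of random projections; the paper's own probabilistic bounds are the authoritative statement.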
Abstract
In a very high-dimensional vector space, two randomly chosen vectors are almost orthogonal with high probability. Starting from this observation, we develop a statistical factor model, the random factor model, in which factors are chosen at random based on the random projection method. Randomness of the factors has the consequence that the covariance matrix is well preserved in a linear factor representation. It also enables the derivation of probabilistic bounds for the accuracy of the random factor representation of time series, their cross-correlations, and their covariances. As an application, we analyze the reproduction of time series and their cross-correlation coefficients in the well-diversified Russell 3000 equity index.
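The near-orthogonality observation that opens the abstract can be checked numerically. The sketch below is an illustration under assumptions, not code from the paper: it draws pairs of Gaussian random vectors and reports the average absolute cosine of the angle between them. The helper name `mean_abs_cosine` and the dimensions 3 and 3000 are chosen only for this example.

```python
import numpy as np

def mean_abs_cosine(dim, n_pairs=2000, seed=0):
    """Average |cos(angle)| between pairs of independent Gaussian random vectors."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal((n_pairs, dim))
    v = rng.standard_normal((n_pairs, dim))
    cos = np.sum(u * v, axis=1) / (
        np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1)
    )
    return float(np.mean(np.abs(cos)))

low_dim = mean_abs_cosine(3)      # low-dimensional: vectors are often far from orthogonal
high_dim = mean_abs_cosine(3000)  # high-dimensional: cosines concentrate near zero
print(f"dim=3: {low_dim:.3f}, dim=3000: {high_dim:.3f}")
```

The typical cosine decays roughly like one over the square root of the dimension, so in a space with thousands of dimensions two random vectors are orthogonal to within a few percent.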
