Embracing the Blessing of Dimensionality in Factor Models
Quefeng Li, Guang Cheng, Jianqing Fan, Yuyan Wang

TL;DR
This paper advocates for leveraging all available high-dimensional data in factor models to improve covariance matrix estimation, providing theoretical conditions, algorithms, and empirical evidence for the benefits of using more data.
Contribution
It introduces conditions under which additional data improves covariance estimation, proposes a divide-and-conquer algorithm, and demonstrates empirical benefits in high-dimensional settings.
Findings
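Under the paper's sufficient conditions, using variables beyond the targeted set improves covariance estimation accuracy, with the gain quantified in terms of Fisher information and convergence rate.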
The divide-and-conquer algorithm maintains statistical accuracy.
Empirical results confirm the advantages of full data utilization.
Abstract
Factor modeling is an essential tool for exploring intrinsic dependence structures among high-dimensional random variables. Much progress has been made for estimating the covariance matrix from a high-dimensional factor model. However, the blessing of dimensionality has not yet been fully embraced in the literature: much of the available data is often ignored in constructing covariance matrix estimates. If our goal is to accurately estimate a covariance matrix of a set of targeted variables, shall we employ additional data, which are beyond the variables of interest, in the estimation? In this paper, we provide sufficient conditions for an affirmative answer, and further quantify its gain in terms of Fisher information and convergence rate. In fact, even an oracle-like result (as if all the factors were known) can be achieved when a sufficiently large number of variables is used. The…
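The question posed in the abstract can be explored with a small simulation. The sketch below is illustrative only and is not the paper's estimator. It assumes a Gaussian factor model y_t = B f_t + u_t with identity factor and noise covariances, uses plain PCA as a stand-in factor estimator, and picks the sizes n, p, s, k arbitrarily. All function names are hypothetical. It compares how well the factor space is recovered from the s target variables alone versus from all p variables, then plugs each factor estimate into a low-rank-plus-diagonal covariance estimate for the targets.

```python
import numpy as np


def simulate(n, p, k, rng):
    """Draw n observations from y_t = B f_t + u_t (assumed Gaussian, unit variances)."""
    B = rng.standard_normal((p, k))   # loadings
    F = rng.standard_normal((n, k))   # latent factors
    U = rng.standard_normal((n, p))   # idiosyncratic noise
    return F @ B.T + U, F, B


def pca_factors(Y, k):
    """PCA factor estimate: sqrt(n) times the top-k left singular vectors of centered Y."""
    Yc = Y - Y.mean(axis=0)
    left, _, _ = np.linalg.svd(Yc, full_matrices=False)
    return np.sqrt(Y.shape[0]) * left[:, :k]


def target_covariance(Y_target, F_hat):
    """Low-rank plus diagonal covariance estimate for the target variables."""
    n = Y_target.shape[0]
    Yc = Y_target - Y_target.mean(axis=0)
    B_hat = Yc.T @ F_hat / n          # least squares, since F_hat.T @ F_hat / n = I
    resid = Yc - F_hat @ B_hat.T
    return B_hat @ B_hat.T + np.diag(resid.var(axis=0))


def subspace_error(F_hat, F):
    """Frobenius distance between the projections onto span(F_hat) and span(F)."""
    Qa, _ = np.linalg.qr(F_hat)
    Qb, _ = np.linalg.qr(F - F.mean(axis=0))
    return np.linalg.norm(Qa @ Qa.T - Qb @ Qb.T)


# Assumed sizes: n samples, p variables in total, s targets, k factors.
rng = np.random.default_rng(0)
n, p, s, k = 200, 500, 10, 2
Y, F, B = simulate(n, p, k, rng)
Y_target = Y[:, :s]
sigma_true = B[:s] @ B[:s].T + np.eye(s)

F_small = pca_factors(Y_target, k)    # factors from the target variables only
F_all = pca_factors(Y, k)             # factors from all p variables

gap_small = subspace_error(F_small, F)
gap_all = subspace_error(F_all, F)
sigma_small = target_covariance(Y_target, F_small)
sigma_all = target_covariance(Y_target, F_all)

print("factor-space gap, targets only:", gap_small)
print("factor-space gap, all variables:", gap_all)
print("covariance error, targets only:", np.linalg.norm(sigma_small - sigma_true))
print("covariance error, all variables:", np.linalg.norm(sigma_all - sigma_true))
```

The factor-space gap is the more stable of the two comparisons. The covariance errors from a single draw are noisy because both estimates share the same sampling error in the factors, so averaging over repeated draws gives a fairer picture.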
Taxonomy
Topics: Gene expression and cancer classification · Statistical Methods and Inference · Complex Network Analysis Techniques
