A unified framework for correlation mining in ultra-high dimension
Yun Wei, Bala Rajaratnam, Alfred O. Hero

TL;DR
This paper introduces a unified framework for correlation and partial correlation mining in ultra-high dimensional data. It removes the block-sparsity assumption required by earlier screening theory, addresses the computational cost of estimating dependence measures, and connects the theory to random geometric graphs.
Contribution
The paper proposes a general methodology that is not restricted to block diagonal covariance structures, links correlation screening to random geometric graphs, and establishes a duality between correlation and partial correlation screening.
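To make the link to geometric graphs concrete, here is a minimal sketch, not the authors' implementation. It relies on one standard identity: if each variable is centered and scaled to unit norm, giving vectors u_i on the unit sphere, then the sample correlation satisfies r_ij = u_i · u_j, so ||u_i - u_j||^2 = 2(1 - r_ij). Thresholding correlation magnitudes at rho is therefore the same as joining points whose distance (up to sign) is at most sqrt(2(1 - rho)). The function names, the threshold value, and the Gaussian test data are all assumptions made for illustration.

```python
import numpy as np

def unit_scores(X):
    """Center each column of X (n samples x p variables) and scale it to unit norm."""
    Xc = X - X.mean(axis=0, keepdims=True)
    return Xc / np.linalg.norm(Xc, axis=0, keepdims=True)

def correlation_graph(X, rho):
    """Adjacency matrix joining variables whose |sample correlation| is >= rho."""
    U = unit_scores(X)
    R = U.T @ U                      # sample correlation matrix
    A = np.abs(R) >= rho
    np.fill_diagonal(A, False)       # ignore self-correlations
    return A

def geometric_graph(X, rho):
    """Same graph, built from distances between points on the unit sphere."""
    U = unit_scores(X)
    radius_sq = 2.0 * (1.0 - rho)    # squared ball radius implied by rho
    d_minus = ((U[:, :, None] - U[:, None, :]) ** 2).sum(axis=0)  # ||u_i - u_j||^2
    d_plus = ((U[:, :, None] + U[:, None, :]) ** 2).sum(axis=0)   # ||u_i + u_j||^2
    A = np.minimum(d_minus, d_plus) <= radius_sq
    np.fill_diagonal(A, False)
    return A

# Hypothetical example: fixed small sample size, many variables.
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 200))   # n = 10 samples, p = 200 variables
rho = 0.8                            # illustrative screening threshold

A_corr = correlation_graph(X, rho)
A_geom = geometric_graph(X, rho)

num_edges = int(A_corr.sum() // 2)                   # pairs passing the screen
num_connected = int((A_corr.sum(axis=1) >= 1).sum()) # variables with at least one partner
print(num_edges, num_connected)
```

The two constructions return the same adjacency matrix, which is why counts of highly correlated variables can be studied as vertex counts in a geometric graph on the sphere. The dense pairwise computation here is only practical for small p.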
Findings
Finite-sample compound Poisson characterizations for highly correlated variables.
Framework applicable to both finite and infinite dimensional settings.
Duality between correlation and partial correlation screening.
Abstract
Many applications benefit from theory relevant to the identification of variables having large correlations or partial correlations in high dimension. Recently there has been progress in the ultra-high dimensional setting when the sample size is fixed and the dimension tends to infinity. Despite these advances, the correlation screening framework suffers from practical, methodological and theoretical deficiencies. For instance, previous correlation screening theory requires that the population covariance matrix be sparse and block diagonal. This block sparsity assumption is however restrictive in practical applications. As a second example, correlation and partial correlation screening requires the estimation of dependence measures, which can be computationally prohibitive. In this paper, we propose a unifying approach to correlation and partial correlation mining that is not…
Taxonomy
Topics: Random Matrices and Applications · Topological and Geometric Data Analysis · Bayesian Methods and Mixture Models
