High dimensional PCA: a new model selection criterion
Abhinav Chakraborty, Soumendu Sundar Mukherjee, Arijit Chakrabarti

TL;DR
This paper investigates the use of modified AIC criteria for consistent model selection in high-dimensional PCA, especially when the eigenvalue gap is small or zero, improving estimation accuracy.
Contribution
It introduces new penalty modifications to AIC that achieve strong consistency even with minimal or zero eigenvalue gaps in high-dimensional settings.
Findings
Modified AIC can achieve strong consistency with arbitrarily small gaps.
New estimators outperform existing methods in simulation studies.
Calibrated penalties significantly reduce mean-squared error.
Abstract
Given a random sample from a multivariate population, estimating the number of large eigenvalues of the population covariance matrix is an important problem in Statistics with wide applications in many areas. In the context of Principal Component Analysis (PCA), the linear combinations of the original variables having the largest amounts of variation are determined by this number. In this paper, we study the high dimensional asymptotic regime where the number of variables grows at the same rate as the number of observations, and use the spiked covariance model proposed in Johnstone (2001), under which the problem reduces to model selection. Our focus is on the Akaike Information Criterion (AIC) which is known to be strongly consistent from the work of Bai et al. (2018). However, Bai et al. (2018) requires a certain "gap condition" ensuring the dominant eigenvalues to be above a…
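To make the setting in the abstract concrete, the following minimal sketch simulates a spiked covariance model and estimates the number of large eigenvalues by counting sample eigenvalues above the Marchenko-Pastur bulk edge (1 + sqrt(p/n))^2. This is a simple baseline for illustration, not the AIC-based criterion studied in the paper. The function names, the spike values, the known unit noise variance, and the `slack` margin are all assumptions introduced here.

```python
import numpy as np


def simulate_spiked_sample(n, p, spikes, seed=0):
    """Draw n observations from N(0, diag(spikes, 1, ..., 1)) in dimension p."""
    rng = np.random.default_rng(seed)
    pop_eigs = np.ones(p)
    pop_eigs[: len(spikes)] = spikes
    # Scaling the columns of a standard normal matrix yields a diagonal covariance.
    return rng.standard_normal((n, p)) * np.sqrt(pop_eigs)


def sample_eigenvalues(X):
    """Eigenvalues of the sample covariance X^T X / n, in descending order.

    Assumes the population mean is known to be zero, so no centering is done.
    """
    n = X.shape[0]
    S = X.T @ X / n
    return np.linalg.eigvalsh(S)[::-1]


def count_above_bulk_edge(eigs, n, p, noise_var=1.0, slack=0.1):
    """Count sample eigenvalues above the Marchenko-Pastur upper edge.

    noise_var is assumed known; slack is an ad hoc relative margin that
    guards against finite-sample fluctuations of the largest noise eigenvalue.
    """
    c = p / n
    edge = noise_var * (1.0 + np.sqrt(c)) ** 2
    return int(np.sum(eigs > edge * (1.0 + slack)))


# Hypothetical example: three spikes, dimension growing proportionally to n.
n, p = 2000, 500
true_spikes = [10.0, 6.0, 4.0]
X = simulate_spiked_sample(n, p, true_spikes, seed=0)
eigs = sample_eigenvalues(X)
k_hat = count_above_bulk_edge(eigs, n, p)
print("estimated number of spikes:", k_hat)
```

The spikes in this example sit well above the detectability threshold 1 + sqrt(p/n), so a fixed cutoff separates them from the noise eigenvalues. A fixed cutoff of this kind becomes unreliable when spikes lie close to that threshold, which is the small-gap regime the paper targets.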
Taxonomy
Topics: Random Matrices and Applications · Statistical Methods and Bayesian Inference · Bayesian Methods and Mixture Models
