Classification of high-dimensional data with spiked covariance matrix structure
Yin-Jen Chen, Minh Tang

TL;DR
This paper introduces an adaptive classification method for high-dimensional data with spiked covariance matrices, combining dimension reduction, feature screening, and Fisher discriminant analysis, achieving near-optimal performance.
Contribution
The paper proposes a novel adaptive classifier that is theoretically proven to be Bayes optimal under certain high-dimensional conditions, extending to quadratic discriminant analysis.
Findings
The classifier is Bayes optimal as the sample size grows large.
The method performs competitively with state-of-the-art techniques.
The classifier operates effectively on data of significantly reduced dimension.
Abstract
We study the classification problem for high-dimensional data in which the covariance matrix of the features exhibits a spiked eigenvalue structure and the difference between the whitened mean vectors is sparse. We analyze an adaptive classifier (adaptive with respect to the sparsity) that first performs dimension reduction on the feature vectors and then classifies in the dimensionally reduced space: the classifier whitens the data, screens the features by keeping only those corresponding to the largest coordinates of the whitened mean difference, and finally applies the Fisher linear discriminant to the selected features. Leveraging recent results on entrywise matrix perturbation bounds for covariance matrices, we show that the resulting classifier is Bayes optimal whenever and $s…
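The three steps named in the abstract (whiten, screen, apply Fisher's linear discriminant) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes two classes with equal priors, a pooled sample covariance whose trailing eigenvalues are replaced by their average (one common way to estimate a spiked covariance), and a user-supplied number of spikes `k` and number of retained coordinates `s`. The paper's classifier is adaptive to the sparsity, whereas here `s` is fixed by hand. The names `fit_whiten_screen_lda` and `predict` are hypothetical.

```python
import numpy as np

def fit_whiten_screen_lda(X1, X2, k, s):
    """Sketch of whiten -> screen -> Fisher LDA (k and s are assumed inputs)."""
    n1, n2 = len(X1), len(X2)
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)

    # Pooled sample covariance of the two classes.
    centered = np.vstack([X1 - m1, X2 - m2])
    cov = centered.T @ centered / (n1 + n2 - 2)
    p = cov.shape[0]

    # Spiked estimate: keep the top-k eigenpairs, average the remaining eigenvalues.
    evals, evecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    top_vals, top_vecs = evals[-k:], evecs[:, -k:]
    noise = evals[:-k].mean()

    # Inverse square root of the spiked estimate (the whitening matrix).
    proj = top_vecs @ top_vecs.T
    W = (top_vecs * top_vals ** -0.5) @ top_vecs.T + noise ** -0.5 * (np.eye(p) - proj)

    # Whitened mean difference, then screening by largest absolute coordinates.
    zeta = W @ (m1 - m2)
    keep = np.argsort(np.abs(zeta))[-s:]
    return {"W": W, "mid": (m1 + m2) / 2, "zeta": zeta, "keep": keep}

def predict(model, X):
    """Fisher rule on the screened, whitened coordinates (equal priors assumed)."""
    Z = (X - model["mid"]) @ model["W"]         # W is symmetric
    scores = Z[:, model["keep"]] @ model["zeta"][model["keep"]]
    return np.where(scores > 0, 1, 2)
```

After whitening, the within-class covariance is approximately the identity, so the Fisher direction reduces to the whitened mean difference itself. Screening then amounts to zeroing all but `s` of its coordinates, which is why prediction only needs those coordinates of the whitened data.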
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Random Matrices and Applications · Statistical Methods and Inference
