Optimal Discriminant Analysis in High-Dimensional Latent Factor Models
Xin Bing, Marten Wegkamp

TL;DR
This paper introduces an optimal high-dimensional classification method based on principal components within a latent-variable model, providing theoretical guarantees and demonstrating superior empirical performance.
Contribution
The paper formulates a latent-variable model to justify PCA-based classification, derives convergence rates for the excess risk that are minimax optimal up to logarithmic factors, and offers a data-driven way to select the number of retained components.
Findings
Derived explicit convergence rates for the PC-based classifier.
Proved the rates are minimax optimal up to logarithmic factors.
Demonstrated superior performance over existing methods in simulations and real data.
Abstract
In high-dimensional classification problems, a commonly used approach is to first project the high-dimensional features into a lower dimensional space, and base the classification on the resulting lower dimensional projections. In this paper, we formulate a latent-variable model with a hidden low-dimensional structure to justify this two-step procedure and to guide which projection to choose. We propose a computationally efficient classifier that takes certain principal components (PCs) of the observed features as projections, with the number of retained PCs selected in a data-driven way. A general theory is established for analyzing such two-step classifiers based on any projections. We derive explicit rates of convergence of the excess risk of the proposed PC-based classifier. The obtained rates are further shown to be optimal up to logarithmic factors in the minimax sense. Our theory…
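The two-step procedure described in the abstract (project onto principal components, then classify on the projections) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only, not the authors' implementation: the function names `fit_pc_lda`, `predict_pc_lda`, and `simulate` are hypothetical, the second step uses plain two-class linear discriminant analysis, the simulated data follow a simple linear factor structure that the abstract does not spell out, and the number of retained PCs `k` is passed in by hand rather than chosen by the paper's data-driven rule.

```python
import numpy as np


def fit_pc_lda(X, y, k):
    """Step 1: project onto the top-k PCs. Step 2: fit two-class LDA on the projections."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    B = Vt[:k].T                      # p x k projection matrix
    Z = Xc @ B                        # n x k projected features
    z0, z1 = Z[y == 0], Z[y == 1]
    mu0, mu1 = z0.mean(axis=0), z1.mean(axis=0)
    # Pooled within-class covariance of the projections.
    S = ((z0 - mu0).T @ (z0 - mu0) + (z1 - mu1).T @ (z1 - mu1)) / (len(Z) - 2)
    w = np.linalg.solve(S, mu1 - mu0)
    pi1 = y.mean()                    # empirical prior of class 1
    b = -0.5 * w @ (mu0 + mu1) + np.log(pi1 / (1.0 - pi1))
    return {"mean": mean, "B": B, "w": w, "b": b}


def predict_pc_lda(model, X):
    """Classify new rows of X with the fitted projection and linear rule."""
    Z = (X - model["mean"]) @ model["B"]
    return (Z @ model["w"] + model["b"] > 0).astype(int)


def simulate(n, p, K, rng):
    """Toy data (assumed model): X = Z A^T + noise, with class-dependent mean of Z."""
    A = rng.normal(size=(p, K))                   # hypothetical loading matrix
    y = rng.integers(0, 2, size=n)
    shift = np.where(y[:, None] == 1, 1.5, -1.5)  # class means of the latent factors
    Z = shift + rng.normal(size=(n, K))
    X = Z @ A.T + rng.normal(size=(n, p))
    return X, y


rng = np.random.default_rng(0)
X, y = simulate(n=400, p=200, K=3, rng=rng)
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]
model = fit_pc_lda(X_train, y_train, k=3)
accuracy = (predict_pc_lda(model, X_test) == y_test).mean()
print(f"test accuracy: {accuracy:.3f}")
```

Here `k` is set to the true latent dimension of the toy data. In practice it would have to be estimated, which is the role of the data-driven selection mentioned in the abstract.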
Taxonomy
Topics: Gene expression and cancer classification · Face and Expression Recognition · Bayesian Methods and Mixture Models
Methods: Balanced Selection
