Sparse linear discriminant analysis by thresholding for high dimensional data

Jun Shao; Yazhen Wang; Xinwei Deng; Sijian Wang

arXiv:1105.3561·math.ST·May 19, 2011

TL;DR

This paper introduces a sparse linear discriminant analysis method tailored for high-dimensional data, improving classification accuracy when the number of variables exceeds the sample size.

Contribution

It proposes a novel sparse LDA approach that is asymptotically optimal under sparsity conditions, addressing limitations of traditional LDA in high-dimensional settings.

Findings

01. Method performs well in simulations

02. Effective in classifying high-dimensional biological data

03. Outperforms traditional LDA in large variable scenarios

Abstract

In many social, economic, biological and medical studies, one objective is to classify a subject into one of several classes based on a set of variables observed from the subject. Because the probability distribution of the variables is usually unknown, the classification rule is constructed using a training sample. The well-known linear discriminant analysis (LDA) works well when the number of variables used for classification is much smaller than the training sample size. Because of advances in technology, modern statistical studies often face classification problems in which the number of variables is much larger than the sample size, and the LDA may perform poorly. We explore when and why the LDA has poor performance and propose a sparse LDA that is asymptotically optimal under some sparsity conditions on the unknown parameters. For illustration of application,…
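As a rough illustration of the idea named in the title, the sketch below builds a two-class LDA rule in which small entries of the estimated covariance matrix and of the estimated mean difference are set to zero before the discriminant direction is computed. The abstract does not spell out the estimator, so the hard-thresholding rule, the function names (`fit_thresholded_lda`, `predict`), the use of a pseudo-inverse, and the threshold values are all assumptions made for this example, not the authors' exact procedure or tuning.

```python
import numpy as np

def fit_thresholded_lda(X1, X2, cov_thresh, mean_thresh):
    """Two-class LDA with hard-thresholded estimates (illustrative sketch).

    X1, X2: arrays of shape (n1, p) and (n2, p), one per class.
    cov_thresh, mean_thresh: hypothetical tuning constants chosen by the user.
    """
    n1, n2 = len(X1), len(X2)
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)

    # Pooled sample covariance matrix.
    R1, R2 = X1 - mu1, X2 - mu2
    S = (R1.T @ R1 + R2.T @ R2) / (n1 + n2 - 2)

    # Zero out small off-diagonal covariances; keep the diagonal untouched.
    S_t = np.where(np.abs(S) > cov_thresh, S, 0.0)
    np.fill_diagonal(S_t, np.diag(S))

    # Zero out small components of the mean difference.
    delta = mu1 - mu2
    delta_t = np.where(np.abs(delta) > mean_thresh, delta, 0.0)

    # Pseudo-inverse is an assumption here: it guards against a
    # thresholded matrix that happens to be singular.
    w = np.linalg.pinv(S_t) @ delta_t
    midpoint = (mu1 + mu2) / 2.0
    return w, midpoint, S_t, delta_t

def predict(X, w, midpoint):
    """Assign class 1 when the linear score is non-negative, else class 2."""
    return np.where((X - midpoint) @ w >= 0, 1, 2)

# Synthetic demo: p = 50 variables, 20 training points per class,
# and only the first 3 variables differ in mean between the classes.
rng = np.random.default_rng(0)
p, n_train, n_test = 50, 20, 100
shift = np.zeros(p)
shift[:3] = 3.0

X1 = rng.normal(size=(n_train, p)) + shift
X2 = rng.normal(size=(n_train, p))
w, midpoint, S_t, delta_t = fit_thresholded_lda(
    X1, X2, cov_thresh=0.8, mean_thresh=1.5
)

X_test = np.vstack([rng.normal(size=(n_test, p)) + shift,
                    rng.normal(size=(n_test, p))])
y_test = np.concatenate([np.ones(n_test, dtype=int),
                         np.full(n_test, 2)])
accuracy = float(np.mean(predict(X_test, w, midpoint) == y_test))
print("nonzero mean-difference components:", np.count_nonzero(delta_t))
print("test accuracy:", accuracy)
```

In this toy setting the number of variables (50) exceeds the total training sample size (40), so the unthresholded pooled covariance matrix is singular and the classical LDA rule cannot be formed by direct inversion. Thresholding here reduces the problem to a few informative coordinates. In practice the two thresholds would need to be chosen from the data, for example by cross-validation.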
