A $U$-classifier for high-dimensional data under non-normality
M. Rauf Ahmad, Tatjana Pavlenko

TL;DR
This paper introduces a new high-dimensional classifier combining U-statistics and discriminant scores, effective under non-normal distributions and small sample sizes, with proven asymptotic properties and demonstrated accuracy.
Contribution
It proposes a novel bias-adjusted linear classifier for high-dimensional, non-normal data, with theoretical analysis and practical validation.
Findings
The classifier accurately classifies high-dimensional data with small samples.
The classifier's asymptotic normality is established.
The classifier is effective on real-world datasets with non-normal distributions.
Abstract
A classifier for two or more samples is proposed when the data are high-dimensional and the underlying distributions may be non-normal. The classifier is constructed as a linear combination of two easily computable and interpretable components, the $U$-component and the $P$-component. The $U$-component is a linear combination of $U$-statistics which are averages of bilinear forms of pairwise distinct vectors from two independent samples. The $P$-component is the discriminant score and is a function of the projection of the $U$-component on the observation to be classified. Combined, the two components constitute an inherently bias-adjusted classifier valid for high-dimensional data. The simplicity of the classifier makes it convenient to study its properties, including its asymptotic normal limit, and to extend it to the multi-sample case. The classifier is linear but its linearity does not rest…
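To make the two-component idea concrete, here is a minimal Python sketch of a bias-adjusted linear rule in the spirit of the abstract. It is an illustration under stated assumptions, not the authors' exact classifier: the abstract does not give the formula, so the specific form below (a difference of $U$-statistics estimating the squared mean norms, plus the projection of the observation on the difference of sample means) is assumed, and the function names `u_stat_norm`, `u_classifier_score`, and `classify` are hypothetical.

```python
import numpy as np


def u_stat_norm(X):
    """U-statistic estimate of mu'mu from the rows of X (needs n >= 2).

    Averages the bilinear forms x_k' x_r over all pairs k != r, which is
    unbiased for mu'mu without any normality assumption.
    """
    n = X.shape[0]
    total = X.sum(axis=0)
    # sum_{k != r} x_k' x_r = ||sum_k x_k||^2 - sum_k ||x_k||^2
    cross = total @ total - np.einsum("ij,ij->", X, X)
    return cross / (n * (n - 1))


def u_classifier_score(x, X1, X2):
    """Assumed score: U-part plus projection part (not the paper's formula)."""
    # U-part: half the difference of the unbiased squared-norm estimates.
    u_part = 0.5 * (u_stat_norm(X2) - u_stat_norm(X1))
    # Projection part: the new observation projected on the mean difference.
    p_part = x @ (X1.mean(axis=0) - X2.mean(axis=0))
    return u_part + p_part


def classify(x, X1, X2):
    """Assign x to group 1 if the score is positive, otherwise to group 2."""
    return 1 if u_classifier_score(x, X1, X2) > 0 else 2
```

In this sketch, using pairs of distinct vectors removes the bias that the plug-in quantity `X.mean(axis=0) @ X.mean(axis=0)` would carry, a bias that grows with the dimension when the sample size is small. Calling `classify(x, X1, X2)` with `X1` and `X2` as arrays of shape `(n_i, p)` and `x` of length `p` returns the assigned group label.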
Taxonomy
Topics: Statistical Methods and Inference · Neural Networks and Applications · Advanced Statistical Methods and Models
