A $U$-classifier for high-dimensional data under non-normality

M. Rauf Ahmad; Tatjana Pavlenko

arXiv:1608.00088·math.ST·August 2, 2016·J. Multivar. Anal.



Open Access

TL;DR

This paper introduces a high-dimensional classifier that combines U-statistics with a discriminant score. The classifier is designed for non-normal distributions and small sample sizes, its asymptotic normality is established, and its accuracy is demonstrated on real data.

Contribution

It proposes a novel bias-adjusted linear classifier for high-dimensional, non-normal data, with theoretical analysis and practical validation.

Findings

01 Accurately classifies high-dimensional data with small samples.

02 The classifier's asymptotic normality is established.

03 Effective on real-world datasets with non-normal distributions.

Abstract

A classifier for two or more samples is proposed when the data are high-dimensional and the underlying distributions may be non-normal. The classifier is constructed as a linear combination of two easily computable and interpretable components, the $U$-component and the $P$-component. The $U$-component is a linear combination of $U$-statistics which are averages of bilinear forms of pairwise distinct vectors from two independent samples. The $P$-component is the discriminant score and is a function of the projection of the $U$-component on the observation to be classified. Combined, the two components constitute an inherently bias-adjusted classifier valid for high-dimensional data. The simplicity of the classifier makes it convenient to study its properties, including its asymptotic normal limit, and to extend it to the multi-sample case. The classifier is linear but its linearity does not rest…
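The two-component construction described in the abstract can be illustrated with a minimal two-sample sketch. It is not taken from the paper: the function names, the scaling by the dimension, the factor of 2, and the decision rule (assign to sample 1 when the score is negative) are all assumptions chosen so that the score estimates the difference of squared Euclidean distances from the new observation to the two population means. The U-part replaces the squared norm of each sample mean with an average of inner products over distinct pairs, which removes the bias that grows with the dimension.

```python
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_vector(sample):
    """Component-wise mean of a list of vectors."""
    n = len(sample)
    return [sum(col) / n for col in zip(*sample)]


def u_statistic(sample):
    """Average of x_k' x_r over all ordered pairs k != r.

    Unbiased for the squared norm of the population mean.
    Uses the identity: sum_{k != r} x_k' x_r = ||sum_k x_k||^2 - sum_k ||x_k||^2.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations per sample")
    total = [sum(col) for col in zip(*sample)]
    cross = dot(total, total) - sum(dot(x, x) for x in sample)
    return cross / (n * (n - 1))


def u_classifier_score(x, sample1, sample2):
    """Hypothetical score: (U-part - 2 * P-part) / p (scaling is an assumption)."""
    p = len(x)
    u_part = u_statistic(sample1) - u_statistic(sample2)
    diff = [a - b for a, b in zip(mean_vector(sample1), mean_vector(sample2))]
    p_part = dot(x, diff)  # projection of x on the mean difference
    return (u_part - 2.0 * p_part) / p


def classify(x, sample1, sample2):
    """Assumed rule: label 1 if the score is negative, otherwise label 2."""
    return 1 if u_classifier_score(x, sample1, sample2) < 0 else 2


if __name__ == "__main__":
    # Illustrative non-normal data: p = 200 features, 10 observations per sample.
    random.seed(0)
    p, n = 200, 10
    s1 = [[random.expovariate(1.0) for _ in range(p)] for _ in range(n)]
    s2 = [[random.expovariate(1.0) + 1.0 for _ in range(p)] for _ in range(n)]
    new_obs = [random.expovariate(1.0) for _ in range(p)]
    print(classify(new_obs, s1, s2))
```

The sketch uses only the standard library to stay self-contained; for realistic dimensions the same sums would normally be written with array operations.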


Taxonomy

Topics: Statistical Methods and Inference · Neural Networks and Applications · Advanced Statistical Methods and Models