The classification for High-dimension low-sample size data

Liran Shen; Meng Joo Er; Qingbo Yin

arXiv:2006.13018·cs.LG·June 7, 2022

TL;DR

This paper introduces a new classification method, NPDMD, designed specifically for high-dimensional low-sample-size data, combining statistical and local information to improve accuracy and robustness.

Contribution

The paper proposes a novel linear classifier, NPDMD, tailored for HDLSS data, addressing challenges with a new criterion and efficient implementation.

Findings

01

NPDMD outperforms existing methods on benchmark datasets.

02

It effectively handles HDLSS data with high accuracy.

03

The method is robust and easy to implement.

Abstract

A huge number of applications in fields such as gene expression analysis and computer vision involve high-dimensional low-sample-size (HDLSS) data sets, which pose great challenges for standard statistical and modern machine learning methods. In this paper, we propose a novel classification criterion for HDLSS data, tolerance similarity, which emphasizes the maximization of within-class variance on the premise of class separability. According to this criterion, a novel linear binary classifier is designed, denoted the No-separated Data Maximum Dispersion classifier (NPDMD). The objective of NPDMD is to find a projection direction w along which all training samples scatter over as large an interval as possible. NPDMD has several characteristics compared to state-of-the-art classification methods. First, it works well on HDLSS data. Second, it combines the sample…
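To make the two quantities in the abstract concrete, the sketch below projects a synthetic HDLSS data set onto a direction w and measures (a) whether the classes stay separated along w and (b) how wide an interval the projected samples occupy. This is not the NPDMD algorithm, whose optimization is not given in the text above. The mean-difference direction is a stand-in, and the function name `projection_summary`, the synthetic data, and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def projection_summary(X, y, w):
    """Project samples onto the unit vector along w.

    Returns (gap, spread):
      gap    - smallest class-1 projection minus largest class-0 projection;
               a positive value means the classes are separated along w.
      spread - length of the interval covered by all projected samples.
    """
    w = w / np.linalg.norm(w)
    z = X @ w
    z_neg, z_pos = z[y == 0], z[y == 1]
    gap = z_pos.min() - z_neg.max()
    spread = z.max() - z.min()
    return gap, spread


# Synthetic HDLSS setting (assumed values): 20 samples, 1000 features.
rng = np.random.default_rng(0)
n_per_class, d = 10, 1000
X = rng.standard_normal((2 * n_per_class, d))
y = np.repeat([0, 1], n_per_class)
X[y == 1] += 0.1  # small mean shift for class 1

# Stand-in direction: difference of class means (NOT the NPDMD solution).
w = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)

gap, spread = projection_summary(X, y, w)
print(f"gap={gap:.3f}, spread={spread:.3f}")
```

A criterion of the kind the abstract describes would compare candidate directions by preferring those with a larger spread among the directions that keep the gap positive. The sketch only evaluates a single direction and does not perform that search.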
