Classification with Nearest Disjoint Centroids
Nicolas Fraiman, Zichao Li

TL;DR
This paper introduces a novel classification method called the nearest disjoint centroid classifier, which uses feature-disjoint centroids and a normalized distance measure, demonstrating improved accuracy and feature efficiency over existing methods.
Contribution
The paper proposes a new classification approach with disjoint feature centroids and a normalized distance, along with an algorithm for feature subset selection and theoretical analysis.
Findings
Achieves lower misclassification rates than the other classifiers compared
Uses fewer features while maintaining accuracy
Effective on both simulated and real gene expression data
Abstract
In this paper, we develop a new classification method based on the nearest centroid classifier, called the nearest disjoint centroid classifier. Our method differs from the nearest centroid classifier in two respects: (1) the centroids are defined on disjoint subsets of features instead of all the features, and (2) the distance is induced by the dimensionality-normalized norm instead of the Euclidean norm. We provide several theoretical results regarding our method. In addition, we propose a simple algorithm, based on adapted k-means clustering, that finds the disjoint subsets of features used in our method, and we extend the algorithm to perform feature selection. We evaluate our method against other classification methods on both simulated data and real-world gene expression datasets. The results demonstrate that our method is able to outperform…
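The two differences described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that the dimensionality-normalized norm is the Euclidean norm divided by the square root of the number of features in the subset, and that the disjoint feature subsets are already given (the paper finds them with an adapted k-means procedure, which is not reproduced here). The function names and the toy data are hypothetical.

```python
import numpy as np

def fit_centroids(X, y, subsets):
    """Compute one centroid per class, restricted to that class's feature subset.

    subsets maps each class label to a list of feature indices. The subsets are
    assumed to be disjoint and supplied by the caller (hypothetical input).
    """
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    return {k: X[y == k][:, idx].mean(axis=0) for k, idx in subsets.items()}

def normalized_distance(u, v):
    """Assumed form of the normalized distance: ||u - v||_2 / sqrt(dimension)."""
    diff = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def predict(X, centroids, subsets):
    """Assign each sample to the class whose centroid is nearest on its own subset."""
    X = np.asarray(X, dtype=float)
    labels = list(subsets)
    dists = np.array([
        [normalized_distance(x[subsets[k]], centroids[k]) for k in labels]
        for x in X
    ])
    return np.array(labels)[dists.argmin(axis=1)]

# Toy data: class 0 is tight on features {0, 1}, class 1 is tight on features {2, 3}.
X = np.array([
    [1.0, 1.1, 0.0, 9.0],
    [0.9, 1.0, 9.0, 0.0],
    [7.0, -3.0, 5.0, 5.1],
    [-4.0, 6.0, 5.1, 4.9],
])
y = np.array([0, 0, 1, 1])
subsets = {0: [0, 1], 1: [2, 3]}

centroids = fit_centroids(X, y, subsets)
predictions = predict(X, centroids, subsets)
print(predictions)
```

Dividing by the square root of the subset size keeps distances comparable when classes use subsets of different sizes, which is the apparent motivation for replacing the plain Euclidean norm.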
Taxonomy
Topics: Gene expression and cancer classification · Face and Expression Recognition · Bioinformatics and Genomic Networks
Methods: k-Means Clustering
