Classification with many classes: challenges and pluses
Felix Abramovich, Marianna Pensky

TL;DR
This paper provides a rigorous theoretical analysis of high-dimensional multi-class classification, revealing that increasing the number of classes can sometimes improve classification accuracy due to better feature selection.
Contribution
It offers non-asymptotic and asymptotic conditions for successful classification in high-dimensional settings with many classes, filling a gap in theoretical understanding.
Findings
A large number of classes can improve feature selection accuracy.
The paper derives lower and upper bounds on the distances between classes required for successful feature selection and classification with a given accuracy.
The observed phenomena are supported by simulations and real data.
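The first finding can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes unit-variance Gaussian noise, random N(0, tau^2) class effects on the first s features, and a simple rule that keeps the s features with the largest between-class sum of squares. The function name `recovery_rate` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def recovery_rate(num_classes, p=500, s=10, n=5, tau=1.0, trials=20, seed=0):
    """Average fraction of the s informative features found among the top-s scores."""
    rng = np.random.default_rng(seed)
    hits = 0.0
    for _ in range(trials):
        # Class means: zero on noise features, random N(0, tau^2) effects
        # on the first s features (an assumption of this sketch).
        means = np.zeros((num_classes, p))
        means[:, :s] = rng.normal(0.0, tau, size=(num_classes, s))
        # n observations per class with unit-variance Gaussian noise.
        x = means[:, None, :] + rng.normal(size=(num_classes, n, p))
        # Per-feature between-class sum of squares of the class sample means.
        class_means = x.mean(axis=1)
        score = n * ((class_means - class_means.mean(axis=0)) ** 2).sum(axis=0)
        # Keep the s highest-scoring features and count how many are informative.
        top = np.argsort(score)[-s:]
        hits += np.mean(top < s)
    return hits / trials

for num_classes in (2, 5, 20, 40):
    print(num_classes, recovery_rate(num_classes))
```

Under these assumptions, each feature's score aggregates evidence across all classes, so the scores of informative and noise features separate more cleanly as the number of classes grows, with the dimension and per-class sample size held fixed.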
Abstract
The objective of the paper is to study the accuracy of multi-class classification in a high-dimensional setting, where the number of classes is also large ("large $L$, large $p$, small $n$" model). While this problem arises in many practical applications and many techniques have recently been developed for its solution, to the best of our knowledge nobody has provided a rigorous theoretical analysis of this important setup. The purpose of the present paper is to fill this gap. We consider one of the most common settings, classification of high-dimensional normal vectors where, unlike under standard assumptions, the number of classes could be large. We derive non-asymptotic conditions on the effects of significant features, and the lower and upper bounds for distances between classes required for successful feature selection and classification with a given accuracy. Furthermore, we study an…
