Fast and Multiphase Rates for Nearest Neighbor Classifiers
Pengkun Yang, Jingzhao Zhang

TL;DR
This paper investigates how the error rates of nearest neighbor classifiers scale with training data size, revealing phase-dependent behaviors and the influence of data distribution complexity on generalization performance.
Contribution
The paper gives a theoretical analysis of error scaling laws for nearest neighbor classifiers, showing that the rate changes between phases of sample size and depends on the complexity of the data distribution.
Findings
In the early phase, the test error depends polynomially on the data dimension and decreases quickly as the training set grows.
In the later phase, the test error depends exponentially on the data dimension and decreases slowly.
Benign data distributions lead to polynomial rather than exponential error dependence.
Abstract
We study the scaling of classification error rates with respect to the size of the training dataset. In contrast to classical results where rates are minimax optimal for a problem class, this work starts with the empirical observation that, even for a fixed data distribution, the error scaling can have diverse rates across different ranges of sample size. To understand when and why the error rate is non-uniform, we theoretically analyze nearest neighbor classifiers. We show that an error scaling law can have fine-grained rates: in the early phase, the test error depends polynomially on the data dimension and decreases fast; whereas in the later phase, the error depends exponentially on the data dimension and decreases slowly. Our analysis highlights the complexity of the data distribution in determining the test error. When the data are distributed benignly, we show that the…
Taxonomy
Topics: Machine Learning and Data Classification · Neural Networks and Applications · Face and Expression Recognition
