Optimal weighted nearest neighbour classifiers

Richard J. Samworth

arXiv:1101.5783·math.ST·February 19, 2013

TL;DR

This paper derives the asymptotically optimal nonnegative weights for weighted nearest-neighbour classifiers and shows that their regret, relative to that of the unweighted k-nearest neighbour classifier, depends asymptotically only on the feature dimension d and not on the underlying populations.

Contribution

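The paper derives an asymptotic expansion for the excess risk (regret) of a weighted nearest-neighbour classifier, identifies the asymptotically optimal nonnegative weights, and compares the resulting classifier with the unweighted k-nearest neighbour and bagged nearest neighbour classifiers as the dimension d varies.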

Findings

01

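The ratio of the regret of the optimally weighted classifier to that of the unweighted k-nearest neighbour classifier depends asymptotically only on the feature dimension d.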

02

The improvement over the unweighted k-nearest neighbour classifier is greatest at d=4 and decreases thereafter as d grows.

03

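The weights of the bagged nearest neighbour classifier are somewhat suboptimal for small d (worse than those of the unweighted k-nearest neighbour classifier when d=1) but close to optimal for large d.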
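The dependence on d in findings 01 and 02 can be explored numerically. The sketch below assumes a closed form for the asymptotic regret ratio that is not stated anywhere on this page; it is recalled from the published paper and should be checked against it before being relied on. The function name `assumed_regret_ratio` and the range of dimensions scanned are illustrative choices.

```python
def assumed_regret_ratio(d):
    """Assumed asymptotic ratio: regret of the optimally weighted classifier
    divided by regret of the unweighted k-nearest neighbour classifier.

    ASSUMPTION: the closed form below is not given in this summary; verify it
    against the paper before use.
    """
    exponent = d / (d + 4.0)
    base = (2.0 * d + 4.0) / (d + 4.0)
    return 4.0 ** (-exponent) * base ** base


# Scan a range of dimensions and find where the ratio is smallest,
# i.e. where the relative improvement is largest.
ratios = {d: assumed_regret_ratio(d) for d in range(1, 51)}
best_d = min(ratios, key=ratios.get)
```

Under this assumed formula the minimiser over d = 1, ..., 50 is d = 4 and the ratio approaches 1 as d grows, which agrees with the findings listed above.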

Abstract

We derive an asymptotic expansion for the excess risk (regret) of a weighted nearest-neighbour classifier. This allows us to find the asymptotically optimal vector of nonnegative weights, which has a rather simple form. We show that the ratio of the regret of this classifier to that of an unweighted $k$-nearest neighbour classifier depends asymptotically only on the dimension $d$ of the feature vectors, and not on the underlying populations. The improvement is greatest when $d=4$, but thereafter decreases as $d \to \infty$. The popular bagged nearest neighbour classifier can also be regarded as a weighted nearest neighbour classifier, and we show that its corresponding weights are somewhat suboptimal when $d$ is small (in particular, worse than those of the unweighted $k$-nearest neighbour classifier when $d=1$), but are close to optimal when $d$ is large. Finally, we argue that improvements in…
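To make the object of study concrete, here is a minimal sketch of a weighted nearest-neighbour classifier: the i-th nearest training point to the query receives weight w_i, and the predicted label is the one with the largest total weight. Uniform weights 1/k on the first k neighbours recover the unweighted k-nearest neighbour classifier. All function names are hypothetical, and `assumed_optimal_weights` encodes a weight formula that is not given in this abstract; it is recalled from the published paper and should be treated as an assumption to verify there.

```python
import math
from collections import defaultdict


def weighted_nn_predict(train_x, train_y, query, weights):
    """Classify `query` by a weighted vote over its nearest neighbours.

    weights[i] is the weight of the (i+1)-th nearest training point;
    training points beyond len(weights) receive weight zero.
    """
    # Order training indices by Euclidean distance to the query.
    order = sorted(range(len(train_x)),
                   key=lambda j: math.dist(train_x[j], query))
    votes = defaultdict(float)
    for w, j in zip(weights, order):
        votes[train_y[j]] += w
    # Predict the label carrying the largest total weight.
    return max(votes, key=votes.get)


def uniform_weights(k):
    """Weights of the unweighted k-nearest neighbour classifier."""
    return [1.0 / k] * k


def assumed_optimal_weights(k, d):
    """ASSUMED form of the asymptotically optimal weights for dimension d,
    supported on the first k neighbours (check against the paper)."""
    a = 1.0 + 2.0 / d
    scale = d / (2.0 * k ** (2.0 / d))
    return [(1.0 + d / 2.0 - scale * (i ** a - (i - 1) ** a)) / k
            for i in range(1, k + 1)]


# Toy one-dimensional data: one point of class "a", two of class "b".
X = [(0.0,), (2.0,), (3.0,)]
y = ["a", "b", "b"]
unweighted = weighted_nn_predict(X, y, (0.5,), uniform_weights(3))
decaying = weighted_nn_predict(X, y, (0.5,), [0.6, 0.3, 0.1])
```

On this toy data the two weight vectors disagree: the uniform vote sides with the two more distant "b" points, while the decaying weights let the single closest point decide. The choice of k is left to the caller here; the paper's asymptotic analysis concerns how the weights should be chosen as the sample size grows.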
