A PAC-Bayesian Analysis of Distance-Based Classifiers: Why Nearest-Neighbour works!
Thore Graepel, Ralf Herbrich

TL;DR
This paper provides PAC-Bayesian bounds for the generalisation error of K-nearest-neighbour classifiers by framing them in a kernel space and relating prior measures to the redundancy of training examples, explaining why K-NN performs well.
Contribution
It introduces a novel PAC-Bayesian bound for K-NN classifiers based on kernel space analysis and sparsity priors, linking redundancy to generalisation performance.
Findings
Bound is non-trivial even with small sample sizes (~100)
Generalisation error relates to the redundancy of training examples
Sparse priors improve understanding of K-NN's success
Abstract
Abstract We present PAC-Bayesian bounds for the generalisation error of the K-nearest-neighbour classifier (K-NN). This is achieved by casting the K-NN classifier into a kernel space framework in the limit of vanishing kernel bandwidth. We establish a relation between prior measures over the coefficients in the kernel expansion and the induced measure on the weight vectors in kernel space. Defining a sparse prior over the coefficients allows the application of a PAC-Bayesian folk theorem that leads to a generalisation bound that is a function of the number of redundant training examples: those that can be left out without changing the solution. The presented bound requires to quantify a prior belief in the sparseness of the solution and is evaluated after learning when the actual redundancy level is known. Even for small sample size (m ~ 100) the bound gives non-trivial results when…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMachine Learning and Algorithms · Machine Learning and Data Classification · Face and Expression Recognition
Methodsk-Nearest Neighbors
