A Bayesian reassessment of nearest-neighbour classification
Lionel Cucala, Jean-Michel Marin, Christian Robert, Mike Titterington

TL;DR
This paper recasts k-nearest-neighbour classification as a Bayesian model, derives new computational tools for inference, and illustrates the approach's effectiveness and limitations on benchmark datasets.
Contribution
It establishes a clear probabilistic framework for k-NN, derives Bayesian inference tools for the model's parameters, and assesses pseudo-likelihood and path sampling approximations of the intractable normalising constant.
Findings
A properly defined Bayesian model gives k-NN a rigorous probabilistic foundation.
Perfect sampling makes a correct MCMC sampler possible despite the intractable normalising constant.
The pseudo-likelihood approximation has notable limitations.
Abstract
The k-nearest-neighbour procedure is a well-known deterministic method used in supervised classification. This paper proposes a reassessment of this approach as a statistical technique derived from a proper probabilistic model; in particular, we modify the assessment made in a previous analysis of this method undertaken by Holmes and Adams (2002, 2003), and evaluated by Manocha and Girolami (2007), where the underlying probabilistic model is not completely well-defined. Once a clear probabilistic basis for the k-nearest-neighbour procedure is established, we derive computational tools for conducting Bayesian inference on the parameters of the corresponding model. In particular, we assess the difficulties inherent to pseudo-likelihood and to path sampling approximations of an intractable normalising constant, and propose a perfect sampling strategy to implement a correct MCMC sampler…
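To make the pseudo-likelihood idea in the abstract concrete, here is a minimal Python sketch. It assumes a Boltzmann-type model in which the full conditional of each label is proportional to exp(beta / k × number of agreeing labels in a symmetrised k-nearest-neighbour set); this form is an assumption for illustration and may differ from the paper's exact specification. The function names `knn_indices` and `log_pseudo_likelihood` are hypothetical. The pseudo-likelihood replaces the joint distribution, whose normalising constant is intractable, by the product of these full conditionals, each of which only needs a sum over the classes.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbours (Euclidean) of each row of X."""
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbour
    return np.argsort(d, axis=1)[:, :k]

def log_pseudo_likelihood(X, y, beta, k, n_classes):
    """Sum over i of log P(y_i | y_{-i}, X, beta, k) under the assumed model."""
    n = len(y)
    nbrs = knn_indices(X, k)
    # A[i, l] = 1 if l is one of the k nearest neighbours of i
    A = np.zeros((n, n))
    A[np.arange(n)[:, None], nbrs] = 1.0
    S = A + A.T  # symmetrised neighbourhood (assumption)
    onehot = np.eye(n_classes)[y]
    counts = S @ onehot            # counts[i, c]: neighbours of i labelled c
    logits = beta * counts / k
    # numerically stable log-sum-exp over classes
    m = logits.max(axis=1, keepdims=True)
    log_norm = m[:, 0] + np.log(np.exp(logits - m).sum(axis=1))
    return float((logits[np.arange(n), y] - log_norm).sum())

# Toy data (made up): two well-separated clusters with matching labels.
X_toy = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
y_toy = np.array([0, 0, 0, 1, 1, 1])
lpl_0 = log_pseudo_likelihood(X_toy, y_toy, beta=0.0, k=2, n_classes=2)
lpl_1 = log_pseudo_likelihood(X_toy, y_toy, beta=1.0, k=2, n_classes=2)
```

At beta = 0 every label is uniform over the classes, so the log pseudo-likelihood equals minus n log(number of classes); on cleanly clustered data it rises as beta grows. The sketch only evaluates the approximation; it does not reproduce the perfect sampling scheme the abstract proposes.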
Taxonomy
Topics: Bayesian Methods and Mixture Models · Anomaly Detection Techniques and Applications · Statistical Methods and Inference
