An Efficient Algorithm for Bayesian Nearest Neighbours
Giuseppe Nuti

TL;DR
This paper introduces a fast Bayesian algorithm for determining the number of neighbours in k-NN classification and regression. The solution is exact for distributions in the exponential family, needs no simulation, and significantly reduces computation time.
Contribution
It presents a novel Bayesian method that computes the posterior distribution of k efficiently and without MCMC, by recasting the choice of k as a change-point detection problem over the data ordered by distance from the target point.
Findings
Favorable results on UCI datasets
Exact posterior computation without simulation
Significant reduction in computational time
Abstract
K-Nearest Neighbours (k-NN) is a popular classification and regression algorithm, yet one of its main limitations is the difficulty in choosing the number of neighbours. We present a Bayesian algorithm to compute the posterior probability distribution for k given a target point within a dataset, efficiently and without the use of Markov Chain Monte Carlo (MCMC) methods or simulation, alongside an exact solution for distributions within the exponential family. The central idea is that data points around our target are generated by the same probability distribution, extending outwards over the appropriate, though unknown, number of neighbours. Once the data is projected onto a distance metric of choice, we can transform the choice of k into a change-point detection problem, for which there is an efficient solution: we recursively compute the probability of the last change-point as we…
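The idea of treating k as a change-point can be illustrated with a minimal Python sketch. This is not the paper's recursive algorithm. It is a simplified single change-point model for binary labels, under assumptions of our own: a Beta(a, b) prior on the Bernoulli parameter of each segment, a uniform prior over k, and labels already sorted by distance from the target point. The function names (`log_marginal`, `posterior_over_k`) are hypothetical.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_marginal(ones, zeros, a=1.0, b=1.0):
    """Log marginal likelihood of a Bernoulli segment under a Beta(a, b) prior."""
    return log_beta(a + ones, b + zeros) - log_beta(a, b)


def posterior_over_k(labels_by_distance, a=1.0, b=1.0):
    """Posterior over k, assuming exactly one change-point after the k-th neighbour.

    labels_by_distance: 0/1 labels, nearest neighbour first (assumed input format).
    Returns a list p where p[k - 1] is the posterior probability of k.
    """
    n = len(labels_by_distance)
    total_ones = sum(labels_by_distance)
    log_post = []
    ones = 0
    for k in range(1, n + 1):
        ones += labels_by_distance[k - 1]
        zeros = k - ones
        rest_ones = total_ones - ones
        rest_zeros = (n - k) - rest_ones
        # First k points share one distribution; the remainder share another.
        log_post.append(
            log_marginal(ones, zeros, a, b)
            + log_marginal(rest_ones, rest_zeros, a, b)
        )
    # Normalise in log space for numerical stability (uniform prior over k).
    m = max(log_post)
    weights = [math.exp(lp - m) for lp in log_post]
    z = sum(weights)
    return [w / z for w in weights]


# Toy input: five nearest neighbours are class 1, the next five are class 0.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
post = posterior_over_k(labels)
best_k = post.index(max(post)) + 1  # most probable k is 5 for this toy input
```

Because the Beta prior is conjugate to the Bernoulli likelihood, each segment's marginal likelihood has a closed form, so no sampling is needed. This mirrors, in a much reduced setting, the abstract's point about exact solutions for exponential-family distributions.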
