An Efficient Algorithm for Bayesian Nearest Neighbours

Giuseppe Nuti

arXiv:1705.09407·cs.LG·June 5, 2017

TL;DR

This paper introduces a fast Bayesian algorithm that computes the posterior distribution over the number of neighbours k in k-NN classification and regression. It needs no simulation, has an exact solution for exponential-family distributions, and significantly reduces computation time.

Contribution

It presents a Bayesian method that computes the posterior distribution of k efficiently without MCMC, by recasting the choice of k as a change-point detection problem once the data is projected onto a distance metric.

Findings

01 Favorable results on UCI datasets

02 Exact posterior computation without simulation

03 Significant reduction in computational time

Abstract

K-Nearest Neighbours (k-NN) is a popular classification and regression algorithm, yet one of its main limitations is the difficulty in choosing the number of neighbours. We present a Bayesian algorithm to compute the posterior probability distribution for k given a target point within a dataset, efficiently and without the use of Markov Chain Monte Carlo (MCMC) methods or simulation, alongside an exact solution for distributions within the exponential family. The central idea is that data points around our target are generated by the same probability distribution, extending outwards over the appropriate, though unknown, number of neighbours. Once the data is projected onto a distance metric of choice, we can transform the choice of k into a change-point detection problem, for which there is an efficient solution: we recursively compute the probability of the last change-point as we…
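
The idea of treating k as a change-point can be illustrated with a minimal Python sketch. It is not the paper's recursive algorithm. It assumes a deliberately simplified model: binary labels, exactly one change-point, a uniform prior over k, and independent Beta(a, b) priors on the label probability of the near and far segments. The names posterior_k, log_marginal, a, and b are hypothetical and chosen only for illustration.

```python
from math import lgamma

import numpy as np


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_marginal(ones, zeros, a=1.0, b=1.0):
    """Log marginal likelihood of a binary sequence under a Beta(a, b) prior."""
    return log_beta(a + ones, b + zeros) - log_beta(a, b)


def posterior_k(X, y, target, a=1.0, b=1.0):
    """Posterior over k under a single change-point, uniform-prior assumption.

    Points are sorted by Euclidean distance from the target; the k nearest
    are assumed to share one Bernoulli parameter and the rest another.
    Returns an array whose entry k-1 is the posterior probability of k.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    dist = np.linalg.norm(X - np.asarray(target, dtype=float), axis=1)
    labels = y[np.argsort(dist, kind="stable")]

    n = len(labels)
    cum_ones = np.concatenate(([0], np.cumsum(labels)))
    total_ones = cum_ones[-1]

    log_post = np.empty(n)
    for k in range(1, n + 1):
        near_ones = cum_ones[k]
        near_zeros = k - near_ones
        far_ones = total_ones - near_ones
        far_zeros = (n - k) - far_ones
        log_post[k - 1] = (log_marginal(near_ones, near_zeros, a, b)
                           + log_marginal(far_ones, far_zeros, a, b))

    # Normalise in log space for numerical stability.
    post = np.exp(log_post - log_post.max())
    return post / post.sum()
```

Given such a posterior, one option is to average the k-NN predictions over k, weighted by these probabilities, rather than committing to a single value of k.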
