Fast and Bayes-consistent nearest neighbors
Klim Efremenko, Aryeh Kontorovich, Moshe Noivirt

TL;DR
This paper introduces a fast, Bayes-consistent nearest neighbor classifier that combines locality-sensitive hashing (LSH) with a novel missing-mass argument, achieving prediction runtime competitive with approximate NN methods while keeping risk rates comparable to minimax.
Contribution
It presents a new nearest neighbor algorithm that is computationally efficient and Bayes-consistent, with risk rates comparable to minimax, bridging the gap between the computational and statistical lines of work on NN methods.
Findings
Achieves Bayes consistency with fast evaluation time.
Preprocessing runtime is O(d n log n).
Query runtime is O(d log n).
Abstract
Research on nearest-neighbor methods tends to focus somewhat dichotomously on either the statistical or the computational aspects: either on, say, Bayes consistency and rates of convergence, or on techniques for speeding up the proximity search. This paper aims at bridging these realms: to reap the advantages of fast evaluation time while maintaining Bayes consistency, and further without sacrificing too much in the risk decay rate. We combine the locality-sensitive hashing (LSH) technique with a novel missing-mass argument to obtain a fast and Bayes-consistent classifier. Our algorithm's prediction runtime compares favorably against state-of-the-art approximate NN methods, while maintaining Bayes consistency and attaining rates comparable to minimax. On samples of size $n$ in $\mathbb{R}^d$, our pre-processing phase has runtime $O(d n \log n)$, while the evaluation phase has runtime $O(d \log n)$…
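The general idea of classifying by hash bucket can be illustrated with a minimal sketch. This is not the paper's algorithm: the random-projection hash, the names `make_hash` and `BucketClassifier`, and the parameters `k` (number of projections) and `w` (cell width) are all assumptions made for illustration. Training points are hashed into cells, each cell predicts by majority vote, and a query that lands in a cell containing no training points (the region whose probability a missing-mass argument would bound) falls back to the overall majority label.

```python
import random
from collections import Counter, defaultdict


def make_hash(d, k, w, rng):
    """Return a hash mapping a d-dimensional point to a tuple of k cell indices.

    Illustrative random-projection LSH, not the scheme used in the paper.
    """
    dirs = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]
    offsets = [rng.uniform(0.0, w) for _ in range(k)]

    def h(x):
        # Project x onto each random direction, shift, and quantize to width w.
        return tuple(
            int((sum(a * b for a, b in zip(v, x)) + o) // w)
            for v, o in zip(dirs, offsets)
        )

    return h


class BucketClassifier:
    """Majority vote inside each hash cell, with a fallback for unseen cells."""

    def __init__(self, d, k=4, w=1.0, seed=0):
        self.h = make_hash(d, k, w, random.Random(seed))
        self.votes = defaultdict(Counter)
        self.default = None

    def fit(self, X, y):
        # One hash evaluation per training point: O(n * k * d) in total.
        for x, label in zip(X, y):
            self.votes[self.h(x)][label] += 1
        self.default = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, x):
        # One hash evaluation plus one dictionary lookup: O(k * d).
        cell = self.votes.get(self.h(x))
        if cell is None:
            return self.default  # query fell in a cell with no training data
        return cell.most_common(1)[0][0]


if __name__ == "__main__":
    X = [[0.0, 0.0], [0.1, 0.0], [100.0, 100.0], [100.0, 100.1]]
    y = [0, 0, 1, 1]
    clf = BucketClassifier(d=2).fit(X, y)
    print(clf.predict([0.0, 0.0]), clf.predict([100.0, 100.0]))
```

In this sketch, taking `k` on the order of $\log n$ makes training cost scale as $d n \log n$ and each query as $d \log n$, which matches the orders quoted in the abstract; whether the paper obtains its bounds this way is not stated in the text above.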
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Machine Learning and Data Classification · Face and Expression Recognition
