# K-nearest Neighbor Search by Random Projection Forests

**Authors:** Donghui Yan, Yingjie Wang, Jin Wang, Honggang Wang, Zhenpeng Li

arXiv: 1812.11689 · 2020-05-27

## TL;DR

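This paper introduces random projection forests (rpForests), an ensemble of random projection trees for kNN search. It reports fast decay in the kNN missing rate and in the kNN distance discrepancy, very low computational complexity, and running time that is nearly inversely proportional to the number of cores or machines. The theory shows that the probability of neighboring points being separated decays exponentially as the ensemble size increases.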

## Contribution

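The paper proposes rpForests, an ensemble method that finds kNNs by aggregating results from multiple random projection trees. It supports the method with a theoretical analysis showing that the probability of neighboring points being separated decays exponentially as the ensemble size increases, and uses that theory to refine how random projections are chosen when growing trees.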

## Key findings

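- rpForests achieves fast decay in both the missing rate of kNNs and the discrepancy in kNN distances.
- The method has very low computational complexity.
- The ensemble structure makes the method easy to run in parallel on multicore or clustered computers, with running time expected to be nearly inversely proportional to the number of cores or machines.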
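
The fast decay in the first finding can be motivated by a back-of-the-envelope calculation. This is only an illustrative sketch, not the paper's precise statement: it assumes, hypothetically, that each tree separates a given pair of neighboring points with probability at most $p < 1$ and that the $T$ trees are grown independently. The symbols $p$ and $T$ are introduced here for illustration. A neighbor is missed by the ensemble only if every tree separates the pair, so:

```latex
\[
\Pr(\text{pair separated in all } T \text{ trees})
  \;\le\; p^{T}
  \;=\; e^{-T \log(1/p)} .
\]
```

Under these assumptions the bound shrinks geometrically in $T$. For instance, with $p = 0.5$, ten trees give a bound of $0.5^{10} \approx 0.001$.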

## Abstract

K-nearest neighbor (kNN) search has wide applications in many areas, including data mining, machine learning, statistics and many applied domains. Inspired by the success of ensemble methods and the flexibility of tree-based methodology, we propose random projection forests (rpForests) for kNN search. rpForests finds kNNs by aggregating results from an ensemble of random projection trees, each constructed recursively through a series of carefully chosen random projections. rpForests achieves remarkable accuracy, with fast decay in both the missing rate of kNNs and the discrepancy in the kNN distances. rpForests has a very low computational complexity. The ensemble nature of rpForests makes it easy to run in parallel on multicore or clustered computers; the running time is expected to be nearly inversely proportional to the number of cores or machines. We give theoretical insights by showing the exponential decay of the probability that neighboring points would be separated by an ensemble of random projection trees as the ensemble size increases. Our theory can be used to refine the choice of random projections in the growth of trees, and experiments show that the effect is remarkable.
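
The aggregation idea described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`build_leaves`, `rpforest_knn`, `brute_force_knn`, `missing_rate`), the parameters `n_trees` and `leaf_size`, and the split rule (a cut point drawn uniformly over the projected range) are all choices made for this sketch. The paper's refined selection of random projections is not reproduced here.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_leaves(points, indices, leaf_size, rng):
    """Recursively split `indices` along random directions; return the leaves."""
    if len(indices) <= leaf_size:
        return [indices]
    # Random Gaussian direction; only the ordering of projections matters.
    w = [rng.gauss(0.0, 1.0) for _ in range(len(points[0]))]
    proj = {i: sum(x * c for x, c in zip(points[i], w)) for i in indices}
    lo, hi = min(proj.values()), max(proj.values())
    # Assumed split rule: a cut point drawn uniformly over the projected range.
    cut = rng.uniform(lo, hi)
    left = [i for i in indices if proj[i] < cut]
    right = [i for i in indices if proj[i] >= cut]
    if not left or not right:  # degenerate split: keep the node as one leaf
        return [indices]
    return (build_leaves(points, left, leaf_size, rng)
            + build_leaves(points, right, leaf_size, rng))


def rpforest_knn(points, k, n_trees=10, leaf_size=20, seed=0):
    """Approximate kNN: pool leaf-mates over all trees, then rank by distance."""
    rng = random.Random(seed)
    n = len(points)
    candidates = [set() for _ in range(n)]
    for _ in range(n_trees):  # trees are independent, so this loop could run in parallel
        for leaf in build_leaves(points, list(range(n)), leaf_size, rng):
            for i in leaf:
                candidates[i].update(leaf)
    result = []
    for i in range(n):
        candidates[i].discard(i)  # a point is not its own neighbor
        ranked = sorted(candidates[i], key=lambda j: sq_dist(points[i], points[j]))
        result.append(ranked[:k])
    return result


def brute_force_knn(points, k):
    """Exact kNN by exhaustive search, used only as a reference."""
    n = len(points)
    return [
        sorted((j for j in range(n) if j != i),
               key=lambda j: sq_dist(points[i], points[j]))[:k]
        for i in range(n)
    ]


def missing_rate(approx, exact):
    """Fraction of exact neighbors that the approximate result fails to return."""
    total = sum(len(e) for e in exact)
    missed = sum(len(set(e) - set(a)) for a, e in zip(approx, exact))
    return missed / total if total else 0.0
```

Given `points` as a list of equal-length lists of floats, `rpforest_knn(points, k=5, n_trees=20, leaf_size=30)` returns, for each point, up to five neighbor indices sorted by distance. Comparing that result with `brute_force_knn(points, 5)` through `missing_rate` gives the fraction of true neighbors the forest missed, which is the quantity the abstract describes as decaying quickly.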

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.11689/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1812.11689/full.md

## References

46 references — full list in the complete paper: https://tomesphere.com/paper/1812.11689/full.md

---
Source: https://tomesphere.com/paper/1812.11689