# A stochastic approach to k-nearest neighbors search using a fixed radius method

**Authors:** Brahian Cano Urrego, Alexander Alsup, Jeffrey A. Thompson, Devin C. Koestler

PMC · DOI: 10.1007/s00180-025-01674-7 · Computational Statistics · 2026-01-13

## TL;DR

This paper introduces a faster way to find the k-nearest neighbors in large datasets by using a stochastic method that reduces computation time without losing accuracy.

## Contribution

A novel stochastic kNN search method is proposed that uses a fixed radius and probabilistic assumptions to reduce computational burden.

## Key findings

- The proposed method outperforms the Brute-force approach, particularly when the Training and Query sets are large.
- On an Alzheimer’s disease dataset, total elapsed time improved 27.57-fold relative to the Brute-force method.
- The method returns the same solution as the Brute-force search while significantly reducing computational load.

## Abstract

This study aims to optimize the $k$-nearest neighbors search (kNN search) by reducing the computational burden of the well-known Brute-force method while providing the same solution. While there exist rule-based approaches for reducing the computational burden of the kNN search, methods that use the stochastic patterns inherent to the data are lacking. Our method leverages data structures and probabilistic assumptions to enhance the scalability of the search. By focusing on the Training set where our neighbors reside, we define a sample space that limits the $k$-nearest neighbors search to a smaller space. For each observation in the Query set (e.g., the set of observations for which a classification is desired), a fixed radius search is employed, with the radius stochastically linked to the desired number of neighbors. This approach allows us to find the $k$-nearest neighbors using only a fraction of the entire Training set in contrast to the Brute-force method, which requires distances to be calculated between each observation in the Training set and each observation in the Query set. Through simulations and a theoretical computational complexity analysis, we demonstrate that our method outperforms the Brute-force approach, particularly when the Training and Query set sample sizes are large. In addition, a benchmarked comparison of our approach and the Brute-force method on an Alzheimer’s disease data set further demonstrated this, showing a 27.57-fold improvement in total elapsed time. Overall, our stochastic approach significantly reduces the computational load of kNN search while maintaining accuracy, making it a viable alternative to traditional methods for large datasets.
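The general idea of a fixed radius search that still returns the exact $k$ nearest neighbors can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the pilot-quantile radius rule (`pilot_radius`), the first-coordinate pruning, and the radius-doubling fallback are all assumptions chosen for illustration. The sketch relies on one fact: if at least $k$ Training points lie within radius $r$ of a query, then the $k$ nearest neighbors are among them, and every such point differs from the query by at most $r$ in any single coordinate.

```python
import numpy as np

def brute_force_knn(train, query, k):
    """Reference search: distances between every query row and every training row."""
    d = np.linalg.norm(query[:, None, :] - train[None, :, :], axis=2)
    return np.argsort(d, axis=1, kind="stable")[:, :k]

def pilot_radius(train, k, rng, pilot=200, inflate=2.0):
    """Hypothetical radius rule (an assumption, not from the paper): take the
    empirical quantile of pilot distances at which roughly inflate * k
    training points are expected to fall inside the ball."""
    n = len(train)
    idx = rng.choice(n, size=min(pilot, n), replace=False)
    d = np.linalg.norm(train[idx, None, :] - train[None, :, :], axis=2)
    return float(np.quantile(d, min(1.0, inflate * k / n)))

def fixed_radius_knn(train, query, k, r0):
    """Exact kNN using a fixed radius r0, doubling r only when fewer than k
    training points fall inside the ball. Returns (indices, distances computed)."""
    if k > len(train):
        raise ValueError("k cannot exceed the number of training rows")
    order = np.argsort(train[:, 0], kind="stable")  # sort once on coordinate 0
    sorted_x = train[order, 0]
    out = np.empty((len(query), k), dtype=int)
    n_dist = 0
    for i, q in enumerate(query):
        r = r0 if r0 > 0 else 1.0
        while True:
            # Points within distance r must satisfy |x0 - q0| <= r.
            lo = np.searchsorted(sorted_x, q[0] - r, side="left")
            hi = np.searchsorted(sorted_x, q[0] + r, side="right")
            cand = order[lo:hi]
            d = np.linalg.norm(train[cand] - q, axis=1)
            n_dist += len(cand)
            inside = d <= r
            if inside.sum() >= k:
                nearest = np.argsort(d[inside], kind="stable")[:k]
                out[i] = cand[inside][nearest]
                break
            r *= 2.0  # too few neighbors in the ball: widen and retry
    return out, n_dist

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train = rng.normal(size=(2000, 3))
    query = rng.normal(size=(50, 3))
    r0 = pilot_radius(train, k=5, rng=rng)
    idx, n_dist = fixed_radius_knn(train, query, k=5, r0=r0)
    print("distances computed:", n_dist, "vs brute force:", len(train) * len(query))
```

The second return value counts how many distances were actually computed, which can be compared against the Training size times the Query size required by the Brute-force search. How much is saved depends on the data, the dimension, and the radius rule, so this sketch makes no claim about reproducing the reported speed-up.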

## Linked entities

- **Diseases:** Alzheimer’s disease (MONDO:0004975)

## Full-text entities

- **Diseases:** Alzheimer's (MESH:D000544), brain tumors (MESH:D001932)
- **Chemicals:** STNNfr (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12799653/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12799653/full.md

## References

2 references — full list in the complete paper: https://tomesphere.com/paper/PMC12799653/full.md

---
Source: https://tomesphere.com/paper/PMC12799653