Learned k-NN Distance Estimation
Daichi Amagata, Yusuke Arai, Sumio Fujita, Takahiro Hara

TL;DR
This paper introduces a neural network-based method to estimate k-NN distances rapidly and accurately, significantly reducing data access costs in large-scale proximity analysis tasks.
Contribution
It presents a novel machine learning approach that predicts k-NN distances with O(1) inference time, avoiding extensive data retrieval.
Findings
Achieves high accuracy in distance estimation
Inference time is constant, O(1)
Demonstrates efficiency on real datasets
Abstract
Big data mining is well known to be an important task for data science, because it can provide useful observations and new knowledge hidden in given large datasets. Proximity-based data analysis is particularly utilized in many real-life applications. In such analysis, the distances to k nearest neighbors are usually employed, thus its main bottleneck is derived from data retrieval. Much efforts have been made to improve the efficiency of these analyses. However, they still incur large costs, because they essentially need many data accesses. To avoid this issue, we propose a machine-learning technique that quickly and accurately estimates the k-NN distances (i.e., distances to the k nearest neighbors) of a given query. We train a fully connected neural network model and utilize pivots to achieve accurate estimation. Our model is designed to have useful advantages: it infers distances to…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsData Management and Algorithms · Data Stream Mining Techniques · Anomaly Detection Techniques and Applications
Methodsk-Nearest Neighbors
