Analysis of approximate nearest neighbor searching with clustered point sets

Songrit Maneewongvatana; David M. Mount

arXiv:cs/9901013 · cs.CG · May 23, 2007 · 59 cites

Open Access

TL;DR

This paper empirically compares different data structures for approximate nearest neighbor searching, showing that alternative methods outperform kd-trees on clustered data and queries.

Contribution

It introduces and evaluates two novel splitting methods, sliding-midpoint and minimum-ambiguity, demonstrating their effectiveness on clustered datasets.

Findings

01

Alternative methods outperform kd-trees on clustered data.

02

The minimum-ambiguity method chooses splitting planes that minimize the average nearest-neighbor ambiguity over a training set of query points.

03

The sliding-midpoint method aims for cells of bounded aspect ratio while avoiding empty cells.

Abstract

We present an empirical analysis of data structures for approximate nearest neighbor searching. We compare the well-known optimized kd-tree splitting method against two alternative splitting methods. The first, called the sliding-midpoint method, attempts to balance the goals of producing subdivision cells of bounded aspect ratio and not producing any empty cells. The second, called the minimum-ambiguity method, is a query-based approach. In addition to the data points, it is also given a training set of query points for preprocessing. It employs a simple greedy algorithm to select the splitting plane that minimizes the average amount of ambiguity in the choice of the nearest neighbor for the training points. We provide an empirical analysis comparing these two methods against the optimized kd-tree construction for a number of synthetically generated data and query sets. We…
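
To make the sliding-midpoint idea concrete, here is a minimal Python sketch of a single split step. It follows the rule as it is commonly described: cut the longest side of the cell at its midpoint, and if every point falls on one side, slide the cut toward the points until it touches one. The function name `sliding_midpoint_split`, the cell representation (`lo`/`hi` corner tuples), and the way points lying exactly on the cut are distributed are assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]

def sliding_midpoint_split(
    points: List[Point], lo: Point, hi: Point
) -> Tuple[int, float, List[Point], List[Point]]:
    """Split one cell [lo, hi] holding at least two points.

    Returns (dim, cut, left, right). Hypothetical helper for illustration.
    """
    if len(points) < 2:
        raise ValueError("need at least two points to split a cell")

    # Cut orthogonally to the longest side of the cell, at its midpoint.
    dim = max(range(len(lo)), key=lambda d: hi[d] - lo[d])
    cut = (lo[dim] + hi[dim]) / 2.0

    coords = [p[dim] for p in points]
    smallest, largest = min(coords), max(coords)

    # If all points lie on one side, slide the cut until it touches a point.
    if smallest > cut:
        cut = smallest
    elif largest < cut:
        cut = largest

    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] > cut]

    # Assumption: points exactly on the cut go to the currently smaller side,
    # which guarantees that neither child cell ends up empty.
    for p in (q for q in points if q[dim] == cut):
        (left if len(left) <= len(right) else right).append(p)

    return dim, cut, left, right


# A tight cluster in one corner of a long, thin cell.
cluster = [(0.1, 0.1), (0.2, 0.15), (0.15, 0.3)]
dim, cut, left, right = sliding_midpoint_split(cluster, (0.0, 0.0), (10.0, 1.0))
```

In this example the plain midpoint cut at x = 5 would leave the right-hand cell empty, so the cut slides down to the largest x-coordinate in the cluster and that point becomes the sole occupant of the right-hand cell.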

Taxonomy

Topics: Data Management and Algorithms · Advanced Image and Video Retrieval Techniques · Automated Road and Building Extraction