Optimal Bounds-Only Pruning for Spatial AkNN Joins
Dominik Winecki

TL;DR
This paper introduces an optimal bounds-only pruning method for exact Euclidean AkNN (all k-nearest-neighbor) joins on partitioned spatial datasets. It uses partition statistics to skip partitions that cannot contribute to the join before they are loaded.
Contribution
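It presents a three-bound proximity test that decides whether all points in a partition have a closer neighbor in one partition than in another, potentially occluded partition. The test captures the directional semantics of bounds-only spatial data, which existing, more conservative pruning methods do not fully capture.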
Findings
The proposed algorithm is proven to be both optimal and efficient.
It effectively localizes join evaluations to fewer partitions.
Experimental results demonstrate significant pruning improvements.
Abstract
We propose a bounds-only pruning test for exact Euclidean AkNN joins on partitioned spatial datasets. Data warehouses commonly partition large tables and store row group statistics for them to accelerate searches and joins, rather than maintaining indexes. AkNN joins can benefit from such statistics by constructing bounds and localizing join evaluations to a few partitions before loading them to build spatial indexes. Existing pruning methods are overly conservative for bounds-only spatial data because they do not fully capture its directional semantics, thereby missing opportunities to skip unneeded partitions at the earliest stages of a join. We propose a three-bound proximity test to determine whether all points within a partition have a closer neighbor in one partition than in another, potentially occluded partition. We show that our algorithm is both optimal and efficient.
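For orientation, the following is a minimal sketch of the kind of conservative bounds-only pruning the abstract contrasts against. It is not the paper's three-bound test. It assumes each partition exposes a 2D axis-aligned bounding box and a row count as its statistics; `Box`, `min_dist`, `max_dist`, `can_prune`, and `near_count` are hypothetical names introduced only for this illustration.

```python
from math import sqrt
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box of one partition (assumed statistic)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def min_dist(a: Box, b: Box) -> float:
    """Smallest possible Euclidean distance between a point in a and a point in b."""
    dx = max(a.xmin - b.xmax, b.xmin - a.xmax, 0.0)
    dy = max(a.ymin - b.ymax, b.ymin - a.ymax, 0.0)
    return sqrt(dx * dx + dy * dy)


def max_dist(a: Box, b: Box) -> float:
    """Largest possible Euclidean distance between a point in a and a point in b."""
    dx = max(a.xmax - b.xmin, b.xmax - a.xmin)
    dy = max(a.ymax - b.ymin, b.ymax - a.ymin)
    return sqrt(dx * dx + dy * dy)


def can_prune(query: Box, near: Box, far: Box, near_count: int, k: int = 1) -> bool:
    """Classic MINDIST/MAXDIST baseline, not the paper's three-bound test.

    If `near` holds at least k points, every query point has k neighbors
    within max_dist(query, near). When every point of `far` is strictly
    farther than that, `far` cannot contribute to any kNN result.
    """
    if near_count < k:
        return False
    return max_dist(query, near) < min_dist(query, far)


query = Box(0.0, 0.0, 1.0, 1.0)
near = Box(1.0, 0.0, 2.0, 1.0)
distant = Box(10.0, 0.0, 11.0, 1.0)
close_by = Box(3.0, 0.0, 4.0, 1.0)

skip_distant = can_prune(query, near, distant, near_count=5)
skip_close_by = can_prune(query, near, close_by, near_count=5)
```

Here `skip_distant` is true, while `skip_close_by` is false because the worst-case distance to `near` (about 2.24) exceeds the best-case distance to `close_by` (2.0). This baseline compares only worst-case and best-case distances over whole boxes, which is the sort of conservatism the abstract attributes to existing methods.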
Taxonomy
Topics: Data Management and Algorithms · Data Quality and Management · Advanced Database Systems and Queries
