Random Sampling over Spatial Range Joins
Daichi Amagata

TL;DR
This paper introduces an efficient algorithm for generating random samples from spatial range join results without executing the full join, significantly reducing computational costs and result set sizes.
Contribution
It presents the first time- and space-efficient algorithm for sampling spatial range join results directly, outperforming baseline methods.
Findings
Algorithm operates in $\tilde{O}(n + m + t)$ expected time
Significantly faster than baseline algorithms in experiments
Effective on real spatial datasets
Abstract
Spatial range joins have many applications, including geographic information systems, location-based social networking services, neuroscience, and visualization. However, joins incur not only expensive computational costs but also overly large result sets. A practical and reasonable approach to alleviating these issues is to return random samples of the join results. Although this is promising and sufficient for many applications involving spatial range joins, efficiently computing random samples is not trivial. This is because we must obtain random join samples without running spatial range joins. We address this challenging problem for the first time and aim to design a time- and space-efficient algorithm. First, we design two baseline algorithms that employ existing techniques for random sampling and show that they are not efficient. Then, we propose a new data structure that can…
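To make the problem concrete, here is a minimal Python sketch of a generic count-then-sample approach to uniform join sampling. It is an illustration only, not the paper's data structure, and it is not claimed to match either of the paper's baselines. Everything in it is an assumption made for the example: the function name `sample_range_join`, square windows of half-width `half_width` centred on each point of `R`, 2D points as tuples, and sampling with replacement. It picks a point `r` with probability proportional to its number of join partners, then picks one partner uniformly, so every join pair is equally likely.

```python
import bisect
import random


def sample_range_join(R, S, half_width, t, rng=None):
    """Draw t uniform samples (with replacement) from the join pairs (r, s),
    where s lies in the square window of half-width `half_width` around r.

    Hypothetical illustration: counts partners per r, then samples.
    """
    rng = rng or random.Random()
    S_sorted = sorted(S)                 # sort by x (ties broken by y)
    xs = [p[0] for p in S_sorted]        # x-coordinates for binary search

    def matches(r):
        # Binary search narrows S to the x-range of the window,
        # then a linear filter checks the y-range.
        lo = bisect.bisect_left(xs, r[0] - half_width)
        hi = bisect.bisect_right(xs, r[0] + half_width)
        return [s for s in S_sorted[lo:hi] if abs(s[1] - r[1]) <= half_width]

    # Number of join partners of each r; the join size is their sum.
    counts = [len(matches(r)) for r in R]
    if sum(counts) == 0:
        return []                        # empty join: nothing to sample

    samples = []
    # Pick r with probability count(r) / join size, then s uniformly
    # among r's partners, so each pair has probability 1 / join size.
    for r in rng.choices(R, weights=counts, k=t):
        samples.append((r, rng.choice(matches(r))))
    return samples


if __name__ == "__main__":
    R = [(0, 0), (10, 10)]
    S = [(0.5, 0.5), (1, 0), (10, 10.5), (50, 50)]
    print(sample_range_join(R, S, half_width=1, t=5, rng=random.Random(0)))
```

The counting pass in this sketch scans every candidate in the x-range of each window, so its cost can approach that of the join itself on dense data. That is the kind of overhead a dedicated sampling structure is meant to avoid.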
Taxonomy
Topics: Facility Location and Emergency Management · Data Management and Algorithms · Computational Geometry and Mesh Generation
