Raptor Zonal Statistics: Fully Distributed Zonal Statistics of Big Raster + Vector Data [Pre-Print]
Samriddhi Singla, Ahmed Eldawy

TL;DR
This paper introduces a fully distributed system for efficient zonal statistics on petabyte-scale raster and vector data, enabling scalable, ad-hoc geospatial analysis without preprocessing.
Contribution
It models zonal statistics as a join problem and presents a novel distributed system for it, backed by a theoretical cost model and an extensive experimental evaluation that shows significant performance improvements.
Findings
Achieves up to 100x faster performance than Rasdaman and Google Earth Engine.
Scales efficiently to petabyte-scale raster and vector datasets.
Operates without preprocessing or indexing, suitable for ad-hoc queries.
Abstract
Recent advancements in remote sensing technology have resulted in petabytes of data in raster format. This data is often processed in combination with high-resolution vector data that represents, for example, city boundaries. One of the common operations that combine big raster and vector data is zonal statistics, which computes statistics for each polygon in the vector dataset. This paper models the zonal statistics problem as a join problem and proposes a novel distributed system that can scale to petabytes of raster and vector data. The proposed method does not require any preprocessing or indexing, which makes it well suited to the ad-hoc queries that scientists usually want to run. We devise a theoretical cost model that proves the efficiency of our algorithm over the baseline method. Furthermore, we run an extensive experimental evaluation on large scale satellite data with up-to…
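To make the "zonal statistics as a join" framing concrete, here is a minimal single-machine sketch in Python. It is not the paper's distributed algorithm: it is a naive nested-loop join, and the function names, the toy raster, the two rectangular zones, and the convention that a pixel belongs to a zone when its center falls inside the polygon are all assumptions made for illustration. The join step pairs each pixel with the zones that contain it; the aggregation step groups those pairs by zone.

```python
from collections import defaultdict


def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def join_pixels_to_zones(raster, zones):
    """Join step: yield (zone_id, pixel_value) for each pixel center inside a zone."""
    for row, values in enumerate(raster):
        for col, value in enumerate(values):
            cx, cy = col + 0.5, row + 0.5  # assumed: unit pixels, origin at (0, 0)
            for zone_id, ring in zones.items():
                if point_in_polygon(cx, cy, ring):
                    yield zone_id, value


def zonal_statistics(raster, zones):
    """Aggregation step: group the joined pairs by zone and summarize each group."""
    groups = defaultdict(list)
    for zone_id, value in join_pixels_to_zones(raster, zones):
        groups[zone_id].append(value)
    return {
        zone_id: {
            "count": len(vals),
            "sum": sum(vals),
            "min": min(vals),
            "max": max(vals),
            "mean": sum(vals) / len(vals),
        }
        for zone_id, vals in groups.items()
    }


# Hypothetical toy inputs: a 3x4 raster and two rectangular zones.
raster = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
zones = {
    "west": [(0, 0), (2, 0), (2, 3), (0, 3)],
    "east": [(2, 0), (4, 0), (4, 3), (2, 3)],
}
stats = zonal_statistics(raster, zones)
```

The nested loop tests every pixel against every polygon, so its cost grows with the product of the two input sizes. That is the kind of baseline cost a join-based formulation aims to avoid at scale.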
Taxonomy
Topics: Data Management and Algorithms · Graph Theory and Algorithms · Advanced Image and Video Retrieval Techniques
