Decentralized Stratified Sampling for Low-Latency Approximate Geospatial Data Stream Processing in Edge-Cloud Architectures
Isam Mashhour Al Jawarneh, Lorenzo Felletti, Luca Foschini, Paolo Bellavista

TL;DR
This paper introduces EdgeApproxGeo, an edge-cloud system with a decentralized geohash-based sampling method, EdgeSOS, enabling low-latency geospatial data analytics with tunable accuracy-efficiency trade-offs.
Contribution
It proposes a novel decentralized sampling algorithm and an edge-cloud architecture for efficient real-time geospatial data processing.
Findings
EdgeApproxGeo achieves significant speedup over cloud-only baselines.
Coarser geohash granularity reduces error by 30%.
The system maintains errors below 10% at an 80% sampling rate.
Abstract
The exponential growth of geospatial data streams from IoT devices challenges conventional cloud-based analytics, which typically suffer from wasted network bandwidth and high latency because all data management, including sampling, is centralized in the cloud. To address this gap, we propose EdgeApproxGeo, a novel edge-cloud architecture that performs spatially stratified online sampling on network edge devices near the data sources. Our system introduces EdgeSOS, a decentralized, geohash-based stratified sampling algorithm designed to operate independently on resource-constrained edge nodes without cross-node synchronization. It is coupled with spatially aware data distribution and topic routing during Apache Kafka stream ingestion, with the aim of optimizing downstream stream processing analytics. We evaluated our system on two…
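The idea of geohash-based stratified sampling on a single edge node can be illustrated with a minimal Python sketch. This is not the authors' EdgeSOS implementation: the function names (`geohash_encode`, `stratified_sample`), the micro-batch input format of `(lat, lon, value)` tuples, and the fixed per-cell sampling fraction are all assumptions made for illustration. Each node groups its own tuples by geohash cell and samples within each cell, so no coordination with other nodes is needed.

```python
import math
import random
from collections import defaultdict

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, bit_count = [], 0, 0
    even = True  # even-numbered bits refine longitude, odd ones latitude
    while len(chars) < precision:
        if even:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits, lon_lo = (bits << 1) | 1, mid
            else:
                bits, lon_hi = bits << 1, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits, lat_lo = (bits << 1) | 1, mid
            else:
                bits, lat_hi = bits << 1, mid
        even = not even
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def stratified_sample(batch, rate, precision=6, rng=None):
    """Sample a fraction `rate` of tuples from each geohash cell (stratum).

    `batch` is an iterable of (lat, lon, value) tuples seen by one node
    in one window (an assumed input format). Returns {geohash: [tuples]}.
    """
    if not 0 < rate <= 1:
        raise ValueError("rate must be in (0, 1]")
    rng = rng or random.Random()
    strata = defaultdict(list)
    for lat, lon, value in batch:
        strata[geohash_encode(lat, lon, precision)].append((lat, lon, value))
    sample = {}
    for cell, rows in strata.items():
        # Keep at least one tuple so sparse cells stay represented.
        k = min(len(rows), max(1, math.ceil(rate * len(rows))))
        sample[cell] = rng.sample(rows, k)
    return sample
```

In this sketch, a shorter `precision` gives coarser cells (fewer, larger strata), which is the granularity knob the findings refer to. The geohash key of each sampled group could also serve as a routing key for stream ingestion, though how the paper maps cells to Kafka topics is not specified in the text above.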
