Efficient Transmission and Reconstruction of Dependent Data Streams via Edge Sampling
Joel Wolfrath, Abhishek Chandra

TL;DR
This paper introduces a hybrid edge-cloud system that leverages correlations between data streams to optimize sampling and estimation, significantly reducing WAN traffic while maintaining query accuracy.
Contribution
It proposes a novel optimization framework for adaptive sampling and estimation in geo-distributed data streams, exploiting inter-stream correlations to improve efficiency.
Findings
Reduces WAN traffic by 27-42% compared to existing methods.
Maintains comparable error rates in aggregate queries.
Demonstrates effectiveness on three real-world datasets.
Abstract
Data stream processing is an increasingly important topic due to the prevalence of smart devices and the demand for real-time analytics. Geo-distributed streaming systems, where cloud-based queries utilize data streams from multiple distributed devices, face challenges since wide-area network (WAN) bandwidth is often scarce or expensive. Edge computing allows us to address these bandwidth costs by utilizing resources close to the devices, e.g. to perform sampling over the incoming data streams, which trades downstream query accuracy to reduce the overall transmission cost. In this paper, we leverage the fact that correlations between data streams may exist across devices located in the same geographical region. Using this insight, we develop a hybrid edge-cloud system which systematically trades off between sampling at the edge and estimation of missing values in the cloud to reduce…
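The trade-off the abstract describes, sampling at the edge and estimating missing values in the cloud from a correlated stream, can be illustrated with a toy sketch. This is not the authors' optimization framework. It assumes two synthetic, linearly correlated streams, a fixed 20% uniform sampling rate, and a least-squares fit as the estimator. All function names and parameters here are hypothetical.

```python
import random
from statistics import fmean


def simulate_streams(n, seed=0):
    """Generate two synthetic, linearly correlated streams (assumption: toy data)."""
    rng = random.Random(seed)
    a, b = [], []
    for _ in range(n):
        base = rng.gauss(20.0, 5.0)
        a.append(base)
        b.append(0.8 * base + 3.0 + rng.gauss(0.0, 0.5))
    return a, b


def edge_sample(b, rate, seed=1):
    """Edge side: keep a uniform random subset of stream b as {index: value}."""
    rng = random.Random(seed)
    k = max(2, int(len(b) * rate))
    kept = rng.sample(range(len(b)), k)
    return {i: b[i] for i in kept}


def cloud_estimate(a, b_sampled):
    """Cloud side: fit b ~ slope * a + intercept on the sampled points,
    then fill in every unsampled position of b from the fully sent stream a."""
    idx = list(b_sampled)
    xs = [a[i] for i in idx]
    ys = [b_sampled[i] for i in idx]
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [b_sampled.get(i, slope * a[i] + intercept) for i in range(len(a))]


if __name__ == "__main__":
    a, b = simulate_streams(1000)
    sampled = edge_sample(b, rate=0.2)
    reconstructed = cloud_estimate(a, sampled)
    print("true mean of b:         ", fmean(b))
    print("mean from samples only: ", fmean(sampled.values()))
    print("mean after estimation:  ", fmean(reconstructed))
```

In this sketch, stream `a` is transmitted in full and only a fraction of `b` crosses the WAN. The cloud then answers an aggregate query (here, the mean) over the reconstructed `b`. The stronger the correlation, the fewer samples of `b` are needed for a given error.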
Taxonomy
Topics: Image and Video Quality Assessment · Data Stream Mining Techniques · IoT and Edge/Fog Computing
