Wide-Area Data Analytics
Rachit Agarwal, Jen Rexford (workshop co-chairs), with contributions from numerous workshop attendees

TL;DR
This paper discusses the challenges and recent research efforts in analyzing data distributed across wide geographic areas, emphasizing the importance of privacy, low latency, and resource constraints.
Contribution
It provides a comprehensive overview of the state of wide-area data analytics and summarizes insights from a dedicated CCC workshop held in 2019.
Findings
Multiple research communities are exploring wide-area data analysis.
Challenges include privacy, latency, and resource management.
Collaborative efforts are shaping future solutions.
Abstract
We increasingly live in a data-driven world, with diverse kinds of data distributed across many locations. In some cases, the datasets are collected from multiple locations, such as sensors (e.g., mobile phones and street cameras) spread throughout a geographic region. The data may need to be analyzed close to where they are produced, particularly when the applications require low latency, high, low cost, user privacy, and regulatory constraints. In other cases, large datasets are distributed across public clouds, private clouds, or edge-cloud computing sites with more plentiful computation, storage, bandwidth, and energy resources. Often, some portion of the analysis may take place on the end-host or edge cloud (to respect user privacy and reduce the volume of data) while relying on remote clouds to complete the analysis (to leverage greater computation and storage resources).…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Anomaly Detection Techniques and Applications · Traffic Prediction and Management Techniques
