Beyond Data Points: Regionalizing Crowdsourced Latency Measurements
Taveesh Sharma, Paul Schmitt, Francesco Bronzino, Nick Feamster, Nicole Marwell

TL;DR
This paper develops spatial analysis techniques to better define geographic boundaries for crowdsourced Internet performance data, improving the accuracy of assessing disparities and informing policy decisions.
Contribution
It introduces a combination of statistical and spatial clustering methods to construct stable, performance-reflective geographic boundaries from crowdsourced datasets.
Findings
Spatial clustering improves boundary stability over traditional methods.
Enhanced boundaries better reflect true Internet performance disparities.
The proposed techniques outperform measurements aggregated directly over census or neighborhood boundaries.
Abstract
Despite significant investments in access network infrastructure, universal access to high-quality Internet connectivity remains a challenge. Policymakers often rely on large-scale, crowdsourced measurement datasets to assess the distribution of access network performance across geographic areas. These decisions typically rest on the assumption that Internet performance is uniformly distributed within predefined social boundaries. However, this assumption may not be valid for two reasons: crowdsourced measurements often exhibit non-uniform sampling densities within geographic areas; and predefined social boundaries may not align with the actual boundaries of Internet infrastructure. In this paper, we present a spatial analysis on crowdsourced datasets for constructing stable boundaries for sampling Internet performance. We hypothesize that greater stability in sampling boundaries will…
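The first concern raised in the abstract, non-uniform sampling density inside a predefined boundary, can be illustrated with a small sketch. This is not the paper's method: the grid-cell weighting, the function names, and the toy latency values below are all assumptions chosen only to show how a densely sampled pocket can dominate an area-wide statistic.

```python
from collections import defaultdict
from statistics import mean, median

def naive_median(samples):
    """Median latency over all samples, ignoring where they were taken."""
    return median(lat for _, _, lat in samples)

def cell_balanced_latency(samples, cell_size):
    """Mean of per-cell medians, so each grid cell counts once.

    Hypothetical helper: samples are (x, y, latency_ms) tuples and
    cell_size is the side length of a square grid cell.
    """
    cells = defaultdict(list)
    for x, y, lat in samples:
        cells[(int(x // cell_size), int(y // cell_size))].append(lat)
    return mean(median(v) for v in cells.values())

# Toy data (made up): eight tests from a well-served pocket of the area,
# two tests from a sparsely sampled, poorly served part of the same area.
samples = [(0.1 * i, 0.5, 20.0) for i in range(1, 9)]
samples += [(1.2, 0.5, 80.0), (1.7, 0.5, 80.0)]

print(naive_median(samples))                # dominated by the dense pocket
print(cell_balanced_latency(samples, 1.0))  # both halves weighted equally
```

With these toy values the naive median reports only the densely sampled pocket, while the cell-balanced figure sits between the two parts of the area, which is the kind of gap that motivates choosing sampling boundaries carefully.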
Taxonomy
Topics: Spatial and Panel Data Analysis
