Clean up or mess up: the effect of sampling biases on measurements of degree distributions in mobile phone datasets
Adeline Decuyper, Arnaud Browet, Vincent Traag, Vincent D. Blondel, and Jean-Charles Delvenne

TL;DR
This paper investigates how sampling biases in mobile phone datasets, especially limited temporal coverage, affect the observed degree distributions, and offers a possible explanation for the emergence of Double Pareto LogNormal (DPLN) patterns.
Contribution
It highlights the impact of temporal sampling biases on degree distribution measurements and explains the emergence of DPLN distributions in mobile phone data.
Findings
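The measured degree distribution differs, both qualitatively and quantitatively, depending on how the dataset is pre-processed.
Limited temporal coverage of the data can have an important influence on the results of network analyses.
Double Pareto LogNormal (DPLN) degree distributions may emerge as a consequence of how temporal data are sampled.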
Abstract
Mobile phone data have been used extensively in recent years to study social behavior. However, most of these studies are based on only partial data whose coverage is limited in both space and time. In this paper, we point out that the bias due to the limited coverage in time may have an important influence on the results of the analyses performed. In particular, we observe significant differences, both qualitative and quantitative, in the degree distribution of the network, depending on the way the dataset is pre-processed, and we present a possible explanation for the emergence of Double Pareto LogNormal (DPLN) degree distributions in temporal data.
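To make the effect of limited temporal coverage concrete, here is a minimal sketch (not the authors' method or data) of how the observed degree of a user depends on the length of the observation window. The call log is synthetic and drawn uniformly at random, which is an unrealistic simplifying assumption. The names `simulate_calls`, `observed_degrees`, and `mean_degree`, and all parameter values, are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def simulate_calls(n_users=200, n_days=60, calls_per_day=300, seed=0):
    """Generate hypothetical (day, caller, callee) records.

    Assumption: callers and callees are picked uniformly at random,
    which real mobile phone data certainly do not satisfy.
    """
    rng = random.Random(seed)
    calls = []
    for day in range(n_days):
        for _ in range(calls_per_day):
            caller, callee = rng.sample(range(n_users), 2)
            calls.append((day, caller, callee))
    return calls


def observed_degrees(calls, start_day, end_day):
    """Degree of each user in the network aggregated over [start_day, end_day)."""
    neighbors = defaultdict(set)
    for day, caller, callee in calls:
        if start_day <= day < end_day:
            neighbors[caller].add(callee)
            neighbors[callee].add(caller)
    return {user: len(contacts) for user, contacts in neighbors.items()}


def mean_degree(degrees):
    """Average degree over the users that appear in the window."""
    return sum(degrees.values()) / len(degrees) if degrees else 0.0


calls = simulate_calls()
week = observed_degrees(calls, 0, 7)    # one week of coverage
full = observed_degrees(calls, 0, 60)   # the full simulated period
print("mean degree, 1 week :", mean_degree(week))
print("mean degree, 60 days:", mean_degree(full))
```

Because the longer window contains every call of the shorter one, a user's observed degree can only grow as coverage increases. The same underlying population therefore yields different measured degree distributions depending on the window chosen during pre-processing.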
Taxonomy
Topics: Complex Network Analysis Techniques · Opinion Dynamics and Social Influence · Human Mobility and Location-Based Analysis
