A Walk in Facebook: Uniform Sampling of Users in Online Social Networks
Minas Gjoka, Maciej Kurant, Carter T. Butts, Athina Markopoulou

TL;DR
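This paper develops and evaluates methods for obtaining uniform samples of users in online social networks (OSNs) by crawling the social graph. It compares the Metropolis-Hastings random walk (MHRW) and the re-weighted random walk (RWRW) against biased alternatives, and applies them to Facebook to analyze its user properties.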
Contribution
It introduces practical sampling frameworks with convergence diagnostics for OSNs and provides the first representative Facebook user sample for analysis.
Findings
MHRW and RWRW produce approximately uniform samples
BFS and unadjusted RW lead to biased results
Online diagnostics effectively assess sample quality
Abstract
Our goal in this paper is to develop a practical framework for obtaining a uniform sample of users in an online social network (OSN) by crawling its social graph. Such a sample allows us to estimate any user property and some topological properties as well. To this end, first, we consider and compare several candidate crawling techniques. Two approaches that can produce approximately uniform samples are the Metropolis-Hastings random walk (MHRW) and a re-weighted random walk (RWRW). Both have pros and cons, which we demonstrate through a comparison to each other as well as to the "ground truth." In contrast, using Breadth-First-Search (BFS) or an unadjusted Random Walk (RW) leads to substantially biased results. Second, and in addition to offline performance assessment, we introduce online formal convergence diagnostics to assess sample quality during the data collection process. We show…
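To make the contrast between the crawling techniques concrete, here is a minimal Python sketch on a toy graph. It is not the authors' crawler: the graph, the function names (mhrw, random_walk, rwrw_mean), and the step counts are assumptions chosen for illustration. It assumes the standard formulations, in which MHRW accepts a move from node v to a neighbor w with probability min(1, deg(v)/deg(w)), and RWRW corrects a plain random walk by weighting each sampled node by 1/degree.

```python
import random

# Hypothetical toy graph (adjacency lists): a hub with five neighbors,
# plus an edge 1-2 so the graph is not bipartite. True mean degree is 12/6 = 2.0.
toy_graph = {
    0: [1, 2, 3, 4, 5],
    1: [0, 2],
    2: [0, 1],
    3: [0],
    4: [0],
    5: [0],
}


def random_walk(graph, start, steps, rng):
    """Unadjusted random walk: visits nodes in proportion to their degree."""
    current, samples = start, []
    for _ in range(steps):
        current = rng.choice(graph[current])
        samples.append(current)
    return samples


def mhrw(graph, start, steps, rng):
    """Metropolis-Hastings random walk targeting the uniform distribution."""
    current, samples = start, []
    for _ in range(steps):
        candidate = rng.choice(graph[current])
        # Accept with probability min(1, deg(current) / deg(candidate));
        # on rejection the walk stays put and the current node is sampled again.
        if rng.random() <= len(graph[current]) / len(graph[candidate]):
            current = candidate
        samples.append(current)
    return samples


def rwrw_mean(graph, samples, value):
    """Re-weighted mean of value(v) over random-walk samples, weights 1/degree."""
    numerator = sum(value(v) / len(graph[v]) for v in samples)
    denominator = sum(1.0 / len(graph[v]) for v in samples)
    return numerator / denominator


def degree(v):
    return len(toy_graph[v])


if __name__ == "__main__":
    rng = random.Random(42)
    rw_samples = random_walk(toy_graph, 0, 100_000, rng)
    mh_samples = mhrw(toy_graph, 0, 100_000, rng)
    print("RW mean degree (biased):", sum(map(degree, rw_samples)) / len(rw_samples))
    print("MHRW mean degree:", sum(map(degree, mh_samples)) / len(mh_samples))
    print("RWRW mean degree:", rwrw_mean(toy_graph, rw_samples, degree))
```

On this toy graph the unadjusted walk overestimates the mean degree because it oversamples the hub, while the MHRW and RWRW estimates should land near the true value of 2.0. A real crawl would fetch neighbor lists through the network's interface instead of a dictionary and would need the convergence diagnostics the abstract mentions, which this sketch omits.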
Taxonomy
Topics: Complex Network Analysis Techniques · Human Mobility and Location-Based Analysis · Spam and Phishing Detection
