Sampling Online Social Networks: Metropolis Hastings Random Walk and Random Walk
Xiao Qi

TL;DR
This paper introduces Metropolis-Hastings Random Walk and Random Walk with Jumps sampling methods for social network analysis, comparing their effectiveness in estimating various network properties and addressing data accessibility issues.
Contribution
The paper presents novel sampling strategies, MHRW and RWwJ, with theoretical foundations and empirical comparisons to improve social network data sampling.
Findings
MHRW outperforms RWwJ in estimating degree distribution (61% less error) and graph order (0.69% less error).
RWwJ provides better estimates of the average follower/following ratio and the proportion of mutual relationships.
Abstract
Social network analysis (SNA) has drawn much attention in recent years, but one bottleneck is that network data are often too massive to handle. Furthermore, some network data are inaccessible because of privacy restrictions. Sampling methods are therefore needed to draw representative sample graphs from the population graph. This paper introduces the Metropolis-Hastings Random Walk (MHRW) and Random Walk with Jumps (RWwJ) sampling strategies, including the procedure for collecting nodes, the underlying mathematical theory, and the corresponding estimators. We compared our methods with existing research outcomes and found that MHRW performs better when estimating degree distribution (61% less error than RWwJ) and graph order (0.69% less error than RWwJ), while RWwJ estimates the average follower and following ratio and the proportion of mutual relationships among adjacent relationships with better…
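The two walks named in the abstract can be sketched as follows. This is a minimal illustration of the commonly used textbook forms, not necessarily the exact procedure or estimators in the paper. Assumptions: the graph is undirected and stored as an adjacency dict; the names `mhrw`, `rwwj`, `reweighted_mean_degree`, `demo_adj`, and the jump weight `alpha` are hypothetical; and RWwJ is taken to jump to a uniformly random node with probability alpha / (deg + alpha), which gives a stationary distribution proportional to deg + alpha.

```python
import random

# Hypothetical toy graph (undirected), as an adjacency dict: node -> list of neighbors.
demo_adj = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1], 3: [0], 4: [0]}

def mhrw(adj, start, steps, rng):
    """Metropolis-Hastings random walk targeting the uniform distribution over nodes."""
    u = start
    visited = [u]
    for _ in range(steps):
        v = rng.choice(adj[u])  # propose a uniformly random neighbor
        # Accept with probability min(1, deg(u) / deg(v)); otherwise stay at u.
        if rng.random() < len(adj[u]) / len(adj[v]):
            u = v
        visited.append(u)
    return visited

def rwwj(adj, start, steps, alpha, rng):
    """Random walk with jumps: assumed jump probability alpha / (deg(u) + alpha)."""
    nodes = list(adj)
    u = start
    visited = [u]
    for _ in range(steps):
        d = len(adj[u])
        if rng.random() < alpha / (d + alpha):
            u = rng.choice(nodes)   # jump to a uniformly random node
        else:
            u = rng.choice(adj[u])  # ordinary random-walk step
        visited.append(u)
    return visited

def reweighted_mean_degree(adj, visited, alpha):
    """Correct the RWwJ degree bias by weighting each visit by 1 / (deg + alpha)."""
    weights = [1.0 / (len(adj[u]) + alpha) for u in visited]
    total = sum(w * len(adj[u]) for w, u in zip(weights, visited))
    return total / sum(weights)

if __name__ == "__main__":
    rng = random.Random(0)
    uniform_sample = mhrw(demo_adj, 0, 20000, rng)
    biased_sample = rwwj(demo_adj, 0, 20000, 1.0, rng)
    # MHRW visits are (asymptotically) uniform, so a plain average suffices.
    print(sum(len(demo_adj[u]) for u in uniform_sample) / len(uniform_sample))
    print(reweighted_mean_degree(demo_adj, biased_sample, 1.0))
```

In this sketch, MHRW samples can be averaged directly because the walk targets the uniform distribution, while RWwJ samples over-represent high-degree nodes and need the reweighting step. Note that the jump step assumes uniform node sampling is available, which may not hold for every online social network.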
Taxonomy
Topics: Complex Network Analysis Techniques · Opinion Dynamics and Social Influence · Mental Health Research Topics
