Statistical properties of sampled networks
Sang Hoon Lee, Pan-Jun Kim, and Hawoong Jeong

TL;DR
This paper analyzes how different sampling methods affect the estimation of topological properties in scale-free networks, highlighting biases and proposing criteria to improve accuracy.
Contribution
It compares three sampling methods applied to scale-free networks, explains why each one biases the estimated network properties, and gives criteria for each method to keep quantities from being overestimated or underestimated.
Findings
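Quantities measured on sampled networks are estimated quite differently depending on the sampling method.
The properties examined are the degree and betweenness centrality distributions, average path length, assortativity, and clustering coefficient.
Criteria are given for each sampling method to prevent overestimation or underestimation.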
Abstract
We study the statistical properties of sampled scale-free networks, which are deeply related to the proper identification of various real-world networks. We use three sampling methods and investigate topological properties of the sampled networks, such as the degree and betweenness centrality distributions, average path length, assortativity, and clustering coefficient, in comparison with those of the original networks. We find that the quantities related to those properties are estimated quite differently for each sampling method. We explain why such biased estimates emerge from the sampling procedure and give appropriate criteria for each sampling method to prevent the quantities from being overestimated or underestimated.
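The comparison described in the abstract can be illustrated with a small sketch. The abstract does not name the three sampling methods, so this example assumes three common choices (random node sampling, random link sampling, and snowball sampling) and a Barabási-Albert preferential-attachment model as the scale-free network. All function names, the network size, and the sample sizes are hypothetical, and only two quantities (mean degree and average clustering coefficient) are measured; it is not the authors' implementation.

```python
import random
from collections import deque

def barabasi_albert(n, m, rng):
    """Grow a graph by preferential attachment; returns {node: set(neighbors)}."""
    adj = {i: set() for i in range(n)}
    pool = []  # each node appears once per edge end, so choice() is degree-biased
    for i in range(m + 1):          # seed: complete graph on m + 1 nodes
        for j in range(i + 1, m + 1):
            adj[i].add(j); adj[j].add(i)
            pool += [i, j]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:      # m distinct degree-biased targets
            chosen.add(rng.choice(pool))
        for t in chosen:
            adj[new].add(t); adj[t].add(new)
            pool += [new, t]
    return adj

def induced(adj, nodes):
    """Subgraph keeping only edges whose two ends are both in `nodes`."""
    nodes = set(nodes)
    return {u: adj[u] & nodes for u in nodes}

def node_sample(adj, k, rng):
    """Assumed method 1: pick k nodes uniformly, keep edges among them."""
    return induced(adj, rng.sample(sorted(adj), k))

def link_sample(adj, k, rng):
    """Assumed method 2: pick k edges uniformly, keep their end nodes."""
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    sub = {}
    for u, v in rng.sample(edges, k):
        sub.setdefault(u, set()).add(v)
        sub.setdefault(v, set()).add(u)
    return sub

def snowball_sample(adj, k, rng):
    """Assumed method 3: breadth-first expansion from a random seed up to k nodes."""
    seed = rng.choice(sorted(adj))
    seen, queue = {seed}, deque([seed])
    while queue and len(seen) < k:
        u = queue.popleft()
        for v in sorted(adj[u]):
            if v not in seen and len(seen) < k:
                seen.add(v); queue.append(v)
    return induced(adj, seen)

def mean_degree(adj):
    return sum(len(nb) for nb in adj.values()) / len(adj)

def avg_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    total = 0.0
    for u, nb in adj.items():
        d = len(nb)
        if d < 2:
            continue
        # each edge between two neighbors of u is counted twice, hence / 2
        links = sum(1 for v in nb for w in adj[v] if w in nb) / 2
        total += 2 * links / (d * (d - 1))
    return total / len(adj)

rng = random.Random(0)
g = barabasi_albert(2000, 3, rng)
samples = {
    "original": g,
    "node": node_sample(g, 400, rng),
    "link": link_sample(g, 400, rng),
    "snowball": snowball_sample(g, 400, rng),
}
for name, s in samples.items():
    print(f"{name:9s} nodes={len(s):5d} "
          f"<k>={mean_degree(s):.3f} C={avg_clustering(s):.4f}")
```

Running the script prints one line per network, so the sampled values can be read against the original. Averaging over many seeds and sample fractions would be needed before drawing any conclusion about the direction or size of a bias.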
Taxonomy
Topics: Complex Network Analysis Techniques · Graph theory and applications · Interconnection Networks and Systems
