Sampling from naturally truncated power laws: The matchmaking paradox

I.M. Sokolov; I. Eliazar

arXiv:0907.0078·physics.data-an·May 29, 2013

TL;DR

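This paper analyzes what happens when a small subset of nodes is sampled from a finite network whose links-per-node distribution is a naturally truncated power law. The sample mean of links per node typically falls well below the population mean, which leads to a paradox: the two parts of a bipartite network appear mismatched purely because of sampling effects.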

Contribution

It presents a statistical analysis of sample-mean fluctuations in networks with naturally truncated power-law link distributions and identifies the matchmaking paradox that arises when sampling bipartite networks.

Findings

01

Sample means are broadly distributed and skewed.

02

Typical sample means are significantly smaller than the true mean.

03

In bipartite networks, sample means of parts differ systematically.

Abstract

Consider a network of M >> 1 nodes connected by N >> 1 links, in which the distribution of the number of links per node follows a power law with exponent 0<\alpha <1. The power law is naturally truncated because N is finite. A subset of m << M nodes is sampled arbitrarily, yielding the sample mean \eta: the average number of links per node within the sampled subset. We explore the statistics of the sample mean \eta and show that its fluctuations around the population mean \nu =N/M are extremely broad and strongly skewed, yielding typical values that are systematically and significantly smaller than the population mean \nu. Applying these results to bipartite networks, we show that the sample means of the two parts of such a network generally differ, a fact we refer to in the title as the "matchmaking paradox".
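The effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method. It assumes one simple way to model natural truncation: draw M Pareto-type weights with tail exponent \alpha and rescale them so that they sum to N. The function names, parameter values, and the use of real-valued (rather than integer) link counts are all illustrative assumptions.

```python
import random
import statistics


def degrees(M, N, alpha, rng):
    """Hypothetical model: heavy-tailed weights rescaled so the links sum to N.

    Each weight is u ** (-1 / alpha) with u uniform in (0, 1], which gives a
    power-law tail with exponent alpha. Rescaling by the finite total N is the
    assumed form of "natural truncation" in this sketch.
    """
    w = [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(M)]
    total = sum(w)
    return [N * x / total for x in w]


def sample_means(k, m, trials, rng):
    """Sample mean eta of m nodes drawn without replacement, repeated."""
    return [statistics.fmean(rng.sample(k, m)) for _ in range(trials)]


rng = random.Random(0)  # fixed seed so the run is reproducible
M, N, alpha, m, trials = 20000, 200000, 0.5, 100, 2000  # assumed values

k = degrees(M, N, alpha, rng)
nu = sum(k) / M  # population mean, equal to N / M by construction
etas = sample_means(k, m, trials, rng)
median_eta = statistics.median(etas)  # a proxy for the "typical" sample mean

print(f"population mean nu  = {nu:.3f}")
print(f"median sample mean  = {median_eta:.3f}")
print(f"largest sample mean = {max(etas):.3f}")
```

Under these assumptions the median of \eta lies well below \nu, while rare samples that happen to include a hub produce much larger values. This is the broad, skewed distribution the abstract refers to. Taking the median as the "typical" value is a choice made for this sketch.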
