Estimating network degree distributions under sampling: An inverse problem, with applications to monitoring social media networks

Yaonan Zhang; Eric D. Kolaczyk; Bruce D. Spencer

arXiv:1305.4977·stat.ME·May 29, 2015

TL;DR

This paper develops a method to accurately estimate the true degree distribution of a network from sampled data by formulating it as an inverse problem and applying regularization techniques, with applications to social media networks.

Contribution

It introduces a constrained, penalized least-squares approach to solve the ill-posed inverse problem of estimating true network degree distributions from samples.

Findings

01. The method accurately reconstructs degree distributions in simulations.

02. It outperforms empirical distributions in real social network data.

03. The approach is effective for both homogeneous and inhomogeneous networks.

Abstract

Networks are a popular tool for representing elements in a system and their interconnectedness. Many observed networks can be viewed as only samples of some true underlying network. Such is frequently the case, for example, in the monitoring and study of massive, online social networks. We study the problem of how to estimate the degree distribution - an object of fundamental interest - of a true underlying network from its sampled network. In particular, we show that this problem can be formulated as an inverse problem. Playing a key role in this formulation is a matrix relating the expectation of our sampled degree distribution to the true underlying degree distribution. Under many network sampling designs, this matrix can be defined entirely in terms of the design and is found to be ill-conditioned. As a result, our inverse problem frequently is ill-posed. Accordingly, we offer a…
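
To make the inverse-problem formulation concrete, here is a minimal sketch in plain Python. It rests on several assumptions that are not stated in the text above. The design is assumed to be induced-subgraph sampling in which each node is kept independently with probability p, so a kept node of true degree j shows an observed degree that is Binomial(j, p). The penalty is assumed to be a squared first-difference smoothness term, and the solver is a simple projected gradient loop. The function names (`sampling_matrix`, `penalized_estimate`, and the helpers) are hypothetical, and this is not presented as the authors' estimator.

```python
from math import comb


def sampling_matrix(max_degree, p):
    """P[i][j] = p * C(j, i) * p^i * (1-p)^(j-i): the expected fraction of
    true-degree-j nodes that are observed with degree i (assumed design)."""
    size = max_degree + 1
    return [
        [p * comb(j, i) * p**i * (1 - p) ** (j - i) if i <= j else 0.0
         for j in range(size)]
        for i in range(size)
    ]


def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def first_differences(x):
    return [x[k + 1] - x[k] for k in range(len(x) - 1)]


def objective(P, observed, n, lam):
    """Squared residual plus lam times the squared first differences of n."""
    resid = [a - b for a, b in zip(matvec(P, n), observed)]
    return sum(r * r for r in resid) + lam * sum(d * d for d in first_differences(n))


def penalized_estimate(P, observed, lam=0.1, steps=2000):
    """Minimize objective(...) subject to n >= 0 by projected gradient descent."""
    size = len(P)
    Pt = [list(col) for col in zip(*P)]
    # Conservative step size: the gradient's Lipschitz constant is at most
    # 2 * (1 + 4 * lam) for this P and the first-difference penalty.
    step = 1.0 / (2.0 * (1.0 + 4.0 * lam))
    n = [0.0] * size
    for _ in range(steps):
        resid = [a - b for a, b in zip(matvec(P, n), observed)]
        grad = [2.0 * g for g in matvec(Pt, resid)]
        d = first_differences(n)
        for k in range(size):
            left = d[k - 1] if k > 0 else 0.0
            right = d[k] if k < size - 1 else 0.0
            grad[k] += 2.0 * lam * (left - right)
        # Projection onto the constraint n >= 0.
        n = [max(0.0, v - step * g) for v, g in zip(n, grad)]
    return n


# Illustrative (made-up) true degree counts for degrees 0..7.
true_counts = [0, 40, 30, 15, 8, 4, 2, 1]
P = sampling_matrix(len(true_counts) - 1, p=0.5)
observed = matvec(P, true_counts)  # noise-free expected sampled counts
estimate = penalized_estimate(P, observed)
print([round(v, 2) for v in estimate])
```

The example feeds in noise-free expected counts, so it only illustrates the mechanics. With real sampled counts, the ill-conditioning of P described above is what makes the penalty weight `lam` and the non-negativity constraint matter.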
