Using a Bayesian approach to reconstruct graph statistics after edge sampling
Naomi A. Arnold, Raul J. Mondragon, Richard G. Clegg

TL;DR
This paper introduces a Bayesian methodology to accurately reconstruct key network properties, such as degree distribution and triangle count, from uniformly edge-sampled networks, addressing challenges posed by data collection limitations.
Contribution
It presents a novel Bayesian approach for recovering fundamental network statistics from edge-sampled data, using priors that can be constructed without assumptions about the original network.
Findings
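The degree distribution and triangle counts (network totals and per-edge counts) are recovered from the edge-sampled network.
The method is tested on two synthetic and three real datasets with diverse sizes, degree distributions, degree-degree correlations and triangle count distributions.
The prior can be constructed without assumptions about the original network.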
Abstract
Often, due to prohibitively large size or to limits on data-collection APIs, it is not possible to work with a complete network dataset, and sampling is required. A type of sampling which is consistent with Twitter API restrictions is uniform edge sampling. In this paper, we propose a methodology for the recovery of two fundamental network properties from an edge-sampled network: the degree distribution and the triangle count (we estimate the totals for the network and the counts associated with each edge). We use a Bayesian approach and show a range of methods for constructing a prior which does not require assumptions about the original network. Our approach is tested on two synthetic and three real datasets with diverse sizes, degree distributions, degree-degree correlations and triangle count distributions.
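To make the idea in the abstract concrete, here is a minimal sketch of Bayesian degree recovery under uniform edge sampling. It is not the authors' implementation. It assumes each edge is retained independently with a known probability p, so a node with true degree k has a sampled degree that follows a Binomial(k, p) distribution, and a triangle survives only if all three of its edges are retained (probability p^3). The function names, the flat prior, and the cutoff `k_max` are illustrative choices; the flat prior is only a placeholder and is not one of the prior constructions described in the paper.

```python
from math import comb


def sampled_degree_likelihood(k_obs, k_true, p):
    """P(sampled degree = k_obs | true degree = k_true), with each edge
    kept independently with probability p (binomial thinning)."""
    if k_true < k_obs:
        return 0.0  # sampling can only remove edges, never add them
    return comb(k_true, k_obs) * p**k_obs * (1 - p) ** (k_true - k_obs)


def degree_posterior(k_obs, p, prior):
    """Posterior over the true degree via Bayes' rule.

    prior[k] is the prior probability that the true degree equals k.
    Returns a list of the same length that sums to 1.
    """
    weights = [
        sampled_degree_likelihood(k_obs, k, p) * prior[k]
        for k in range(len(prior))
    ]
    total = sum(weights)
    if total == 0:
        raise ValueError("prior gives zero mass to every feasible degree")
    return [w / total for w in weights]


def scale_up_triangles(t_obs, p):
    """Naive point estimate of the total triangle count: each triangle
    survives sampling with probability p**3."""
    return t_obs / p**3


# Illustrative inputs (assumptions, not values from the paper).
k_max = 60
flat_prior = [1.0 / (k_max + 1)] * (k_max + 1)  # placeholder prior
posterior = degree_posterior(k_obs=5, p=0.5, prior=flat_prior)

# Posterior mean of the true degree for a node observed with degree 5.
mean_degree = sum(k * q for k, q in enumerate(posterior))
triangle_estimate = scale_up_triangles(t_obs=10, p=0.5)
```

The posterior puts no mass on degrees below the observed degree, since sampling cannot create edges. Replacing `flat_prior` with a more informed prior changes only the `prior` argument, which is where the choice of prior construction would enter.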
Taxonomy
Topics: Complex Network Analysis Techniques · Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models
