Statistical inference for subgraph counts and clustering coefficient using network sampling in a sparse Stochastic Block Model framework
Anirban Mandal, Arindam Chatterjee

TL;DR
This paper develops limit laws for network sampling estimates of subgraph counts and clustering coefficients in sparse stochastic block models, enabling predictive inference under resource-constrained node sampling.
Contribution
It introduces a model-based approach with new bounds and conditions for Gaussian and Poisson limit laws in sparse network sampling, including ego-centric methods.
Findings
The ego-centric approach handles higher sparsity than the induced approach.
Inference quality remains stable below a sparsity threshold.
Transitions from Gaussian to Poisson limits occur with increasing sparsity.
Abstract
This article develops limit laws for network-sampling-based estimates of subgraph counts and the clustering coefficient of a large population network, and uses them for predictive inference. A model-based approach is used, where the population network is assumed to be generated from a sparse Stochastic Block Model (SBM). To quantify the effects of node sampling under resource constraints, a sparse Bernoulli node sampling scheme is introduced, where the node selection probability decays to zero as the population size increases. Both induced and ego-centric network formation approaches are explored. Quantitative bounds on the speed of normal approximation for estimated subgraph counts are obtained in a joint model- and design-based asymptotic framework. These bounds show that inference accuracy depends on model sparsity, sampling sparsity, and features like edge density and minimum vertex…
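The setup described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' estimators: it assumes a two-block SBM whose connectivity matrix is scaled by a sparsity factor `rho`, Bernoulli node sampling with probability `p`, and simple inverse-probability scaling for the edge count (the simplest subgraph). The function names, the parameter values, and the convention that the ego-centric sample observes every edge incident to a sampled node are assumptions made for illustration only.

```python
import numpy as np

def sample_sbm(n, block_probs, B, rng):
    """Draw a symmetric 0/1 adjacency matrix (no self-loops) from an SBM."""
    z = rng.choice(len(block_probs), size=n, p=block_probs)  # block labels
    P = B[z][:, z]                                   # n x n edge probabilities
    upper = np.triu(rng.random((n, n)) < P, k=1)     # independent edges, i < j
    return (upper | upper.T).astype(np.int8)

def bernoulli_node_sample(n, p, rng):
    """Select each node independently with probability p (boolean mask)."""
    return rng.random(n) < p

def edge_count_induced(A, s, p):
    """Induced subgraph: an edge is seen only if both endpoints are sampled,
    which happens with probability p**2."""
    observed = A[np.ix_(s, s)].sum() / 2
    return observed / p**2

def edge_count_ego(A, s, p):
    """Ego-centric (assumed convention): an edge is seen if at least one
    endpoint is sampled, which happens with probability 1 - (1 - p)**2."""
    within = A[np.ix_(s, s)].sum() / 2       # both endpoints sampled
    observed = A[s, :].sum() - within        # sampled degrees minus double count
    return observed / (1 - (1 - p) ** 2)

# Illustrative values only (hypothetical, not taken from the paper).
rng = np.random.default_rng(0)
n, rho, p = 300, 0.1, 0.3
B0 = np.array([[0.6, 0.2], [0.2, 0.5]])
A = sample_sbm(n, [0.5, 0.5], rho * B0, rng)
s = bernoulli_node_sample(n, p, rng)

print("true edges:      ", A.sum() / 2)
print("induced estimate:", edge_count_induced(A, s, p))
print("ego estimate:    ", edge_count_ego(A, s, p))
```

Under these assumptions the ego-centric sample retains an edge with probability 1 - (1 - p)^2, which is roughly 2p for small p, compared with p^2 for the induced sample. Repeating the draw while shrinking `rho` or `p` gives an informal view of how the two estimates behave as the network or the sample becomes sparser.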
