Provable and practical approximations for the degree distribution using sublinear graph samples
Talya Eden, Shweta Jain, Ali Pinar, Dana Ron, C. Seshadhri

TL;DR
This paper introduces SADDLES, a sublinear algorithm that estimates the degree distribution of massive graphs from a small random sample, with provable accuracy guarantees and strong empirical performance.
Contribution
The paper presents SADDLES, a sublinear algorithm for degree distribution estimation that gives provably accurate outputs for all values of the degree distribution and is more sample-efficient than existing sampling methods on large graphs.
Findings
SADDLES gives accurate estimates of the degree distribution while sampling at most 1% of the vertices.
The algorithm is provably sublinear in the graph size when the two fatness measures of the degree distribution (the h-index and the z-index) are large.
Empirical results show SADDLES outperforms state-of-the-art sampling algorithms in accuracy and sample efficiency.
Abstract
The degree distribution is one of the most fundamental properties used in the analysis of massive graphs. There is a large literature on graph sampling, where the goal is to estimate properties (especially the degree distribution) of a large graph through a small, random sample. The degree distribution estimation poses a significant challenge, due to its heavy-tailed nature and the large variance in degrees. We design a new algorithm, SADDLES, for this problem, using recent mathematical techniques from the field of sublinear algorithms. The SADDLES algorithm gives provably accurate outputs for all values of the degree distribution. For the analysis, we define two fatness measures of the degree distribution, called the h-index and the z-index. We prove that SADDLES is sublinear in the graph size when these indices are large. A corollary of this result is a provably sublinear…
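To make the quantities in the abstract concrete, the sketch below computes a cumulative form of the degree distribution (the number of vertices with degree at least d) and the h-index of a degree sequence, taken here in its usual sense: the largest h such that at least h vertices have degree at least h. It also includes a naive uniform vertex sampling estimator as a baseline. This is not the SADDLES algorithm. The function names, the input format (a non-empty list of vertex degrees), and the choice of the cumulative form are assumptions made for illustration.

```python
import random
from collections import Counter


def ccdh(degrees):
    """Return a list N where N[d] is the number of vertices with degree >= d.

    Assumes `degrees` is a non-empty list of non-negative integers.
    """
    max_deg = max(degrees)
    counts = Counter(degrees)
    n_at_least = [0] * (max_deg + 2)
    # Accumulate counts from the largest degree downwards.
    for d in range(max_deg, -1, -1):
        n_at_least[d] = n_at_least[d + 1] + counts.get(d, 0)
    return n_at_least[: max_deg + 1]


def h_index(degrees):
    """Largest h such that at least h vertices have degree >= h."""
    h = 0
    for rank, d in enumerate(sorted(degrees, reverse=True), start=1):
        if d >= rank:
            h = rank
        else:
            break
    return h


def sampled_ccdh(degrees, d, num_samples, rng):
    """Naive baseline (not SADDLES): estimate N(d) from uniform vertex samples.

    Samples vertices with replacement and scales the hit fraction by n.
    """
    n = len(degrees)
    hits = sum(1 for _ in range(num_samples) if degrees[rng.randrange(n)] >= d)
    return n * hits / num_samples


if __name__ == "__main__":
    degs = [5, 3, 3, 1, 1, 1]  # hypothetical toy degree sequence
    print(ccdh(degs))
    print(h_index(degs))
    print(sampled_ccdh(degs, 3, 1000, random.Random(0)))
```

The baseline illustrates the difficulty the abstract mentions: with a heavy-tailed distribution, very few uniform samples land on high-degree vertices, so the estimate of the tail has high variance unless the sample is large.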
