Estimating and Sampling Graphs with Multidimensional Random Walks
Bruno Ribeiro, Don Towsley

TL;DR
This paper introduces Frontier sampling, a multidimensional random walk built from dependent walkers. In disconnected or loosely connected graphs it estimates graph characteristics with lower error than traditional methods, and it samples the tail of the degree distribution better.
Contribution
The paper proposes Frontier sampling, a multidimensional random walk that uses dependent random walkers. It keeps the sampling properties of a regular random walk while reducing estimation error when the graph has disconnected or loosely connected components.
Findings
Frontier sampling has lower estimation errors in disconnected graphs.
It better samples the tail of the degree distribution.
It retains the desirable properties of regular random walks.
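To make the degree-distribution finding concrete, here is a minimal sketch of how a degree distribution is commonly estimated from random-walk samples. It assumes an undirected graph in which the walk visits vertices with probability proportional to their degree, so each observation is reweighted by the inverse of its degree. The function name and the input format (a list of degrees of the visited vertices) are illustrative assumptions and do not come from the paper.

```python
from collections import defaultdict


def estimate_degree_distribution(sampled_degrees):
    """Estimate the fraction of vertices with each degree.

    sampled_degrees: degrees (all >= 1) of the vertices visited by a
    random walk on an undirected graph. Hypothetical input format.
    """
    weights = defaultdict(float)
    total = 0.0
    for d in sampled_degrees:
        # A degree-d vertex is visited about d times as often as a
        # degree-1 vertex, so weight each visit by 1/d to compensate.
        weights[d] += 1.0 / d
        total += 1.0 / d
    # Normalize so the estimated fractions sum to 1.
    return {d: w / total for d, w in weights.items()}
```

Without the 1/d weighting, high-degree vertices would be overrepresented in the estimate, because the walk reaches them more often.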
Abstract
Estimating characteristics of large graphs via sampling is a vital part of the study of complex networks. Current sampling methods such as (independent) random vertex and random walks are useful but have drawbacks. Random vertex sampling may require too many resources (time, bandwidth, or money). Random walks, which normally require fewer resources per sample, can suffer from large estimation errors in the presence of disconnected or loosely connected graphs. In this work we propose a new m-dimensional random walk that uses m dependent random walkers. We show that the proposed sampling method, which we call Frontier sampling, exhibits all of the nice sampling properties of a regular random walk. At the same time, our simulations over large real-world graphs show that, in the presence of disconnected or loosely connected components, Frontier sampling exhibits lower estimation errors…
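The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of Frontier sampling as it is commonly described. It makes three assumptions: the m walkers start at vertices chosen uniformly at random, the walker to move at each step is chosen with probability proportional to the degree of its current vertex, and that walker then follows a uniformly random incident edge. The function name, the adjacency-list input, and the `budget` and `seed` parameters are hypothetical.

```python
import random


def frontier_sampling(adj, m, budget, seed=None):
    """Return a list of sampled edges (u, v).

    adj: dict mapping each vertex to a list of its neighbors
         (undirected graph, so the lists are symmetric).
    m: number of dependent walkers.
    budget: number of edges to sample.
    """
    rng = random.Random(seed)
    # Assumption: start each walker at a uniformly random vertex
    # that has at least one neighbor.
    candidates = [v for v, nbrs in adj.items() if nbrs]
    walkers = [rng.choice(candidates) for _ in range(m)]
    edges = []
    for _ in range(budget):
        # Pick one walker with probability proportional to the
        # degree of the vertex it currently occupies.
        weights = [len(adj[v]) for v in walkers]
        i = rng.choices(range(m), weights=weights, k=1)[0]
        u = walkers[i]
        # Move that walker along a uniformly random incident edge.
        v = rng.choice(adj[u])
        edges.append((u, v))
        walkers[i] = v
    return edges
```

On a graph made of two disconnected triangles, for example, calling `frontier_sampling(adj, m=4, budget=50)` can return edges from both components whenever the initial walkers land in both, whereas a single walker stays in the component where it started.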
Taxonomy
Topics: Complex Network Analysis Techniques · Graph theory and applications · Topological and Geometric Data Analysis
