On Large-Scale Graph Generation with Validation of Diverse Triangle Statistics at Edges and Vertices
Geoffrey Sanders, Roger Pearce, Timothy La Fond, Jeremy Kepner

TL;DR
This paper presents methods for generating large-scale, realistic Kronecker product graphs with verifiable triangle statistics, enabling efficient benchmarking of distributed graph algorithms with exact ground-truth calculations.
Contribution
It introduces formulas for calculating triangle participation at vertices and edges in Kronecker product graphs, improving reproducibility and efficiency in graph analytics benchmarking.
Findings
Kronecker graphs can be generated with exact triangle statistics.
Triangle participation formulas enable efficient ground-truth computation.
Graphs are highly compressible and suitable for distributed environments.
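The compressibility finding can be illustrated with a minimal sketch, assuming the product graph is never stored and its edges are instead enumerated on demand from the two factor edge lists. The function names `kron_edges` and `symmetrize`, the toy factors, and the vertex numbering `i * n_b + k` are illustrative assumptions, not taken from the paper.

```python
from itertools import product


def symmetrize(edges):
    """Return both directions of each undirected edge (the adjacency nonzeros)."""
    return [(u, v) for u, v in edges] + [(v, u) for u, v in edges]


def kron_edges(edges_a, edges_b, n_b):
    """Lazily yield the adjacency nonzeros of the Kronecker product graph.

    edges_a, edges_b: (row, col) nonzeros of the factor adjacency matrices.
    n_b: number of vertices in the second factor.
    Product vertex (i, k) is mapped to the index i * n_b + k (an assumed convention
    that matches the usual Kronecker product block layout).
    """
    for (i, j), (k, l) in product(edges_a, edges_b):
        yield (i * n_b + k, j * n_b + l)


# Hypothetical toy factors: a triangle and a 3-vertex path.
triangle_a = symmetrize([(0, 1), (1, 2), (0, 2)])  # 6 nonzeros
path_b = symmetrize([(0, 1), (1, 2)])              # 4 nonzeros

product_edges = list(kron_edges(triangle_a, path_b, n_b=3))
```

Only the 6 + 4 factor nonzeros need to be kept in memory, while the product has 6 × 4 nonzeros; each worker in a distributed setting could enumerate its own slice of the pairs.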
Abstract
Researchers developing implementations of distributed graph analytic algorithms require graph generators that yield graphs sharing the challenging characteristics of real-world graphs (small-world, scale-free, heavy-tailed degree distribution) with efficiently calculable ground-truth solutions to the desired output. Reproducibility for current generators used in benchmarking is somewhat lacking in this respect due to their randomness: the output of a desired graph analytic can only be compared to expected values and not exact ground truth. Nonstochastic Kronecker product graphs meet these design criteria for several graph analytics. Here we show that many flavors of triangle participation can be cheaply calculated while generating a Kronecker product graph. Given two medium-sized scale-free graphs with adjacency matrices A and B, their Kronecker product graph has adjacency matrix…
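As a hedged illustration of how triangle participation in the product can be obtained from the factors alone, the sketch below checks two identities that follow from the mixed-product property of the Kronecker product. It assumes both factors, called A and B here, are undirected and have no self-loops; the paper's own formulas may be more general. The helper names and the toy factors (complete graphs on 3 and 4 vertices) are assumptions made for this example.

```python
import numpy as np


def vertex_triangles(adj):
    """Triangles at each vertex: diag(A^3) counts each one twice."""
    return np.diag(adj @ adj @ adj) // 2


def edge_triangles(adj):
    """Triangles at each edge: elementwise product of A with A^2."""
    return adj * (adj @ adj)


def complete_graph(n):
    """Adjacency matrix of the complete graph on n vertices (no self-loops)."""
    return np.ones((n, n), dtype=np.int64) - np.eye(n, dtype=np.int64)


# Hypothetical toy factors.
A = complete_graph(3)
B = complete_graph(4)

# Ground truth from the factors only (assumes no self-loops in A or B).
t_formula = 2 * np.kron(vertex_triangles(A), vertex_triangles(B))
delta_formula = np.kron(edge_triangles(A), edge_triangles(B))

# Direct computation on the explicit product, for validation only.
C = np.kron(A, B)
t_direct = vertex_triangles(C)
delta_direct = edge_triangles(C)

assert np.array_equal(t_formula, t_direct)
assert np.array_equal(delta_formula, delta_direct)
```

The direct computation is only feasible for small factors; the point of the factor-side formulas is that the explicit product never has to be formed to obtain the ground truth.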
