Streaming Graph Partitioning in the Planted Partition Model
Charalampos E. Tsourakakis

TL;DR
This paper introduces higher length walks into streaming graph partitioning and shows that they can significantly improve partition quality at a minor computational cost. The framework is well suited to large-scale and dynamic graphs.
Contribution
It proposes a novel use of higher length walks in streaming graph partitioning, shows that the true partition is recovered with high probability in the planted partition model, and identifies the walk length that gives the best partition quality.
Findings
High probability recovery of true partition in planted model
Optimal walk length identified for best partition quality
Experimental validation confirms theoretical advantages
Abstract
The sheer increase in the size of graph data has created a lot of interest in developing efficient distributed graph processing frameworks. Popular existing frameworks such as GraphLab and Pregel rely on balanced graph partitioning in order to minimize communication and achieve work balance. In this work we contribute to the recent research line of streaming graph partitioning \cite{stantonstreaming,stanton,fennel}, which computes an approximately balanced $k$-partitioning of the vertex set of a graph in a single pass over the graph stream, using degree-based criteria. This graph partitioning framework is well tailored to processing large-scale and dynamic graphs. In this work we introduce the use of higher length walks for streaming graph partitioning and show that their use incurs only a minor computational cost and can significantly improve the quality of the graph partition. We…
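To make the single-pass, degree-based setting concrete, here is a minimal sketch of a greedy streaming partitioner in the style of the cited line of work. It is an illustration under stated assumptions, not the paper's higher-length-walk method: the function name `stream_partition`, the `(vertex, neighbors)` stream format, the capacity-weighted score, and the tie-breaking rule are all hypothetical choices made for this example.

```python
def stream_partition(stream, k, capacity):
    """Assign each arriving vertex to one of k parts in a single pass.

    `stream` yields (vertex, neighbors) pairs (assumed format).
    `capacity` is the maximum number of vertices allowed in a part;
    k * capacity must be at least the number of vertices.
    """
    assignment = {}   # vertex -> part index
    sizes = [0] * k   # current number of vertices in each part

    for v, neighbors in stream:
        # Degree-based criterion: count already-placed neighbors per part.
        counts = [0] * k
        for u in neighbors:
            if u in assignment:
                counts[assignment[u]] += 1

        best, best_key = None, None
        for i in range(k):
            if sizes[i] >= capacity:
                continue  # part is full, skip it
            # Discount the neighbor count by how full the part is,
            # so that parts stay approximately balanced.
            score = counts[i] * (1.0 - sizes[i] / capacity)
            # Ties go to the smaller part, then to the lower index.
            key = (score, -sizes[i])
            if best_key is None or key > best_key:
                best, best_key = i, key

        assignment[v] = best
        sizes[best] += 1

    return assignment


# Two triangles {0, 1, 2} and {3, 4, 5} joined by the single edge 2-3.
example_stream = [
    (0, [1, 2]),
    (3, [2, 4, 5]),
    (1, [0, 2]),
    (4, [3, 5]),
    (2, [0, 1, 3]),
    (5, [3, 4]),
]
parts = stream_partition(example_stream, k=2, capacity=3)
print(parts)
```

On this toy stream each triangle ends up in its own part. The score here only looks at direct neighbors (walks of length one); the paper's contribution is to use longer walks in place of such a criterion, which this sketch does not attempt to reproduce.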
Taxonomy
Topics: Graph Theory and Algorithms · Complexity and Algorithms in Graphs · Data Management and Algorithms
