Partitioning Trillion-edge Graphs in Minutes
George M. Slota, Sivasankaran Rajamanickam, Karen Devine, Kamesh Madduri

TL;DR
XtraPuLP is a scalable distributed graph partitioner capable of processing trillion-edge graphs in minutes, achieving high-quality partitions comparable to state-of-the-art methods and significantly reducing graph analytics execution time.
Contribution
This paper introduces XtraPuLP, a distributed-memory graph partitioner based on label propagation that is designed to partition trillion-edge graphs quickly while keeping partition quality high.
Findings
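On a collection of large sparse graphs, XtraPuLP produces partitions comparable in quality to state-of-the-art partitioning methods.
It partitions real-world graphs with billion+ vertices in minutes.
Using XtraPuLP partitions for distributed-memory graph analytics significantly reduces end-to-end execution time.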
Abstract
We introduce XtraPuLP, a new distributed-memory graph partitioner designed to process trillion-edge graphs. XtraPuLP is based on the scalable label propagation community detection technique, which has been demonstrated as a viable means to produce high-quality partitions with minimal computation time. On a collection of large sparse graphs, we show that XtraPuLP partitioning quality is comparable to state-of-the-art partitioning methods. We also demonstrate that XtraPuLP can produce partitions of real-world graphs with billion+ vertices in minutes. Further, we show that using XtraPuLP partitions for distributed-memory graph analytics leads to significant end-to-end execution time reduction.
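Example
The abstract names label propagation community detection as the technique XtraPuLP builds on. The sketch below illustrates only the generic idea on a single machine: every vertex starts with its own label and repeatedly adopts the most frequent label among its neighbors. It is not XtraPuLP's algorithm, which is distributed and, as a partitioner, must also control part sizes. The function names, the synchronous update order, the smallest-label tie-break, and the toy graph are all assumptions made for illustration. The edge cut helper counts edges whose endpoints carry different labels, one common measure of partition quality.

```python
from collections import Counter


def label_propagation(adj, max_iters=20):
    """Generic synchronous label propagation (illustrative, not XtraPuLP).

    adj maps each vertex to a list of its neighbors (undirected graph).
    Returns a dict mapping each vertex to its final label.
    """
    # Every vertex starts in its own community.
    labels = {v: v for v in adj}
    for _ in range(max_iters):
        new_labels = {}
        for v, nbrs in adj.items():
            if not nbrs:
                # Isolated vertices keep their label.
                new_labels[v] = labels[v]
                continue
            counts = Counter(labels[u] for u in nbrs)
            top = max(counts.values())
            # Assumed tie-break: smallest label among the most frequent,
            # which keeps the sketch deterministic.
            new_labels[v] = min(lab for lab, c in counts.items() if c == top)
        if new_labels == labels:
            break  # no vertex changed its label
        labels = new_labels
    return labels


def edge_cut(adj, labels):
    """Count undirected edges whose endpoints have different labels."""
    return sum(
        1
        for v, nbrs in adj.items()
        for u in nbrs
        if v < u and labels[v] != labels[u]
    )


# Hypothetical toy graph: two triangles joined by the single edge 2-3.
demo_adj = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4, 5],
    4: [3, 5],
    5: [3, 4],
}
demo_labels = label_propagation(demo_adj)
print(demo_labels, "edge cut:", edge_cut(demo_adj, demo_labels))
```

On this toy graph the labels settle on one community per triangle, so only the bridging edge is cut. Plain label propagation has no notion of balance and can collapse a graph into one large community, which is why a partitioner built on it needs additional constraints on part size.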
