Hybrid Edge Partitioner: Partitioning Large Power-Law Graphs under Memory Constraints

Ruben Mayer; Hans-Arno Jacobsen

arXiv:2103.12594·cs.DC·March 24, 2021

TL;DR

The paper introduces Hybrid Edge Partitioner (HEP), a system that partitions large graphs that fit only partly into memory. It combines in-memory and streaming partitioning to improve partitioning quality and the speed of downstream graph processing.

Contribution

HEP is a new system that trades off memory use against partitioning quality by combining a new in-memory algorithm with streaming partitioning.

Findings

1. HEP outperforms traditional in-memory and streaming partitioners on large real-world graphs.

2. Using HEP significantly speeds up distributed graph processing on Spark/GraphX.

3. HEP effectively balances memory overhead and partition quality in large-scale graph partitioning.

Abstract

Distributed systems that manage and process graph-structured data internally solve a graph partitioning problem to minimize their communication overhead and query run-time. Besides computational complexity (optimal graph partitioning is NP-hard), another important consideration is the memory overhead. Real-world graphs are often so large that loading the complete graph into memory for partitioning is not economical or feasible. Currently, the common approach to reducing memory overhead is to rely on streaming partitioning algorithms. While the latest streaming algorithms achieve reasonable partitioning quality on some graphs, they are still not fully competitive with in-memory partitioners. In this paper, we propose a new system, Hybrid Edge Partitioner (HEP), that can partition graphs that fit partly into memory while yielding a high partitioning quality. HEP can…
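
To make the streaming side of this trade-off concrete, here is a minimal sketch of a generic one-pass greedy edge partitioner. It is not HEP's algorithm, nor any specific published streaming partitioner. The names `stream_partition` and `replication_factor` and the `capacity` parameter are hypothetical, chosen for illustration. The replication factor (average number of partitions holding a copy of a vertex) is used here as the quality measure on the assumption that it is the metric of interest; lower is better.

```python
from collections import defaultdict


def stream_partition(edges, k, capacity):
    """Assign each edge to one of k partitions in a single pass.

    Illustrative greedy heuristic (an assumption, not HEP): prefer a
    partition that already holds the edge's endpoints, break ties by
    lower load, and skip partitions that reached `capacity` edges.
    """
    replicas = defaultdict(set)  # vertex -> set of partitions holding a copy
    loads = [0] * k              # number of edges per partition
    assignment = []
    for u, v in edges:
        # Partitions with room left; fall back to all if every one is full.
        candidates = [p for p in range(k) if loads[p] < capacity] or list(range(k))
        best = max(
            candidates,
            key=lambda p: ((p in replicas[u]) + (p in replicas[v]), -loads[p]),
        )
        replicas[u].add(best)
        replicas[v].add(best)
        loads[best] += 1
        assignment.append(best)
    return assignment, replicas, loads


def replication_factor(replicas):
    """Average number of partitions in which a vertex is replicated."""
    return sum(len(parts) for parts in replicas.values()) / len(replicas)


# A small star graph: vertex 0 is the high-degree hub.
edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
assignment, replicas, loads = stream_partition(edges, k=2, capacity=2)
print(assignment, loads, replication_factor(replicas))
```

The partitioner only keeps per-vertex replica sets and per-partition loads in memory, never the edge list itself, which is what makes the streaming approach cheap. It also decides each edge without seeing the rest of the graph, which is the usual reason such heuristics can fall short of in-memory partitioners in quality.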

Code & Models

Repositories

mayerrn/hybrid_edge_partitioner (Official)