An Empirical Comparison of Big Graph Frameworks in the Context of Network Analysis
Jannis Koch, Christian L. Staudt, Maximilian Vogel, Henning Meyerhenke

TL;DR
This paper empirically compares four distributed graph processing frameworks for large-scale network analysis, highlighting their performance differences and suitability depending on network size and complexity.
Contribution
It provides a comprehensive performance comparison of GraphLab, Giraph, Giraph++, and Flink for large-scale graph algorithms in a distributed setting.
Findings
Among the distributed frameworks, GraphLab and Apache Giraph perform best.
Distributed frameworks enable analysis of networks with billions of edges.
Shared-memory implementations outperform distributed ones for smaller, memory-resident networks.
Abstract
Complex networks are relational data sets commonly represented as graphs. The analysis of their intricate structure is relevant to many areas of science and commerce, and data sets may reach sizes that require distributed storage and processing. We describe and compare programming models for distributed computing with a focus on graph algorithms for large-scale complex network analysis. Four frameworks - GraphLab, Apache Giraph, Giraph++ and Apache Flink - are used to implement algorithms for the representative problems Connected Components, Community Detection, PageRank and Clustering Coefficients. The implementations are executed on a computer cluster to evaluate the frameworks' suitability in practice and to compare their performance to that of the single-machine, shared-memory parallel network analysis package NetworKit. Out of the distributed frameworks, GraphLab and Apache Giraph…
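To make the kind of programming model under comparison concrete, the following is a minimal single-machine sketch of a vertex-centric, superstep-based computation for the Connected Components problem named in the abstract. It is an illustration only: it does not use the API of GraphLab, Apache Giraph, Giraph++, Apache Flink or NetworKit, and it is not the paper's implementation. The function name `connected_components`, the input format and the minimum-label propagation scheme are assumptions chosen for the example.

```python
from collections import defaultdict


def connected_components(edges, vertices):
    """Label every vertex with the smallest vertex id in its component.

    Hypothetical helper that simulates synchronous supersteps on one
    machine. In each superstep, every active vertex sends its current
    label to its neighbours; a vertex that receives a smaller label
    adopts it and stays active for the next superstep.
    """
    # Build an undirected adjacency structure from the edge list.
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    label = {v: v for v in vertices}  # each vertex starts in its own component
    active = set(vertices)            # all vertices take part in superstep 1
    supersteps = 0

    while active:
        # Message phase: active vertices send their label to all neighbours.
        inbox = defaultdict(list)
        for v in active:
            for w in neighbours[v]:
                inbox[w].append(label[v])

        # Compute phase: a vertex adopts the smallest label it received,
        # and only vertices whose label changed remain active.
        active = set()
        for w, messages in inbox.items():
            smallest = min(messages)
            if smallest < label[w]:
                label[w] = smallest
                active.add(w)

        supersteps += 1

    return label, supersteps


if __name__ == "__main__":
    labels, steps = connected_components(
        edges=[(1, 2), (2, 3), (4, 5)],
        vertices=[1, 2, 3, 4, 5, 6],
    )
    print(labels, steps)
```

In a distributed setting the vertices would be partitioned across machines and the message phase would involve network communication, which is one reason a shared-memory implementation can be faster when the graph fits on a single machine.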
Taxonomy
Topics: Graph Theory and Algorithms · Complex Network Analysis Techniques · Cloud Computing and Resource Management
