Experimental Analysis of Distributed Graph Systems
Khaled Ammar, Tamer Ozsu

TL;DR
This paper conducts a comprehensive experimental comparison of eight distributed graph processing systems across multiple large datasets and workloads, analyzing their performance, scalability, and usability.
Contribution
It provides an independent, empirical evaluation of system performance and scalability, offering insights and tuning heuristics for better efficiency.
Findings
GraphLab (PowerGraph) outperforms others in scalability
Performance varies significantly across datasets and workloads
System tuning heuristics improve overall performance
Abstract
This paper evaluates eight parallel graph processing systems: Hadoop, HaLoop, Vertica, Giraph, GraphLab (PowerGraph), Blogel, Flink Gelly, and GraphX (SPARK) over four very large datasets (Twitter, World Road Network, UK 200705, and ClueWeb) using four workloads (PageRank, WCC, SSSP and K-hop). The main objective is to perform an independent scale-out study by experimentally analyzing the performance, usability, and scalability (using up to 128 machines) of these systems. In addition to performance results, we discuss our experiences in using these systems and suggest some system tuning heuristics that lead to better performance.
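For readers unfamiliar with the four workloads named in the abstract, the following is a minimal single-machine sketch of what each one computes. It is not taken from the paper and does not reflect how any of the evaluated systems implement them. The function names, the adjacency-dictionary graph format, the damping factor of 0.85, and the unit edge weights used for SSSP are all assumptions made for illustration.

```python
from collections import deque


def pagerank(adj, damping=0.85, iterations=20):
    """Power-iteration PageRank on a directed graph {vertex: [out-neighbors]}."""
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iterations):
        nxt = {v: (1.0 - damping) / n for v in adj}
        for v, outs in adj.items():
            if outs:
                # Split this vertex's damped rank evenly among its out-neighbors.
                share = damping * rank[v] / len(outs)
                for w in outs:
                    nxt[w] += share
            else:
                # Dangling vertex (assumed handling): spread its rank uniformly.
                for w in adj:
                    nxt[w] += damping * rank[v] / n
        rank = nxt
    return rank


def wcc(adj):
    """Weakly connected components: label each vertex with the smallest id in its component."""
    undirected = {v: set() for v in adj}
    for v, outs in adj.items():
        for w in outs:
            undirected[v].add(w)
            undirected[w].add(v)  # ignore edge direction
    label = {}
    for start in sorted(adj):
        if start in label:
            continue
        label[start] = start
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in undirected[v]:
                if w not in label:
                    label[w] = start
                    queue.append(w)
    return label


def hop_distances(adj, source, max_hops=None):
    """Breadth-first search from source.

    With max_hops=None this is SSSP under the assumption of unit edge weights;
    with max_hops=k it is a K-hop query (vertices within k hops of source).
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if max_hops is not None and dist[v] == max_hops:
            continue  # do not expand past the hop limit
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


# Hypothetical toy graph: a 3-cycle (0 -> 1 -> 2 -> 0) plus a separate edge 3 -> 4.
toy_graph = {0: [1], 1: [2], 2: [0], 3: [4], 4: []}
ranks = pagerank(toy_graph)
components = wcc(toy_graph)
sssp = hop_distances(toy_graph, 0)
one_hop = hop_distances(toy_graph, 0, max_hops=1)
```

On this toy graph, WCC finds two components, SSSP from vertex 0 reaches only the cycle, and the 1-hop query returns vertex 0 and its direct neighbor. The distributed systems in the study run the same computations over partitioned graphs, which is where the scalability differences reported above arise.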
