GraphX: Unifying Data-Parallel and Graph-Parallel Analytics
Reynold S. Xin, Daniel Crankshaw, Ankur Dave, Joseph E. Gonzalez, Michael J. Franklin, Ion Stoica

TL;DR
GraphX is a unified framework that combines graph-parallel and data-parallel processing, enabling efficient, flexible, and easy-to-use graph analytics pipelines with performance comparable to specialized systems.
Contribution
The paper introduces GraphX, a system that unifies graph-parallel and data-parallel computation within a single framework using relational algebra and query optimization.
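To make the idea of expressing graph-parallel computation through relational operators more concrete, here is a minimal sketch in plain Python. It is not the GraphX API: the `vertices` and `edges` tables, the `pagerank_step` helper, and the choice of PageRank as the example algorithm are all assumptions made for illustration. One iteration is written as a join of the edge table with the vertex table, a group-by on the destination vertex, and a map over the vertex table.

```python
from collections import defaultdict

# Hypothetical toy graph held as two plain "tables" (illustration only,
# not GraphX's data structures).
vertices = {1: 1.0, 2: 1.0, 3: 1.0}        # vertex id -> current rank
edges = [(1, 2), (1, 3), (2, 3), (3, 1)]    # (source id, destination id)


def out_degrees(edge_table):
    """Count outgoing edges per source vertex (a group-by with count)."""
    degrees = defaultdict(int)
    for src, _dst in edge_table:
        degrees[src] += 1
    return dict(degrees)


def pagerank_step(vertex_table, edge_table, damping=0.85):
    """One PageRank iteration written as join, group-by, then map."""
    degrees = out_degrees(edge_table)

    # Join: pair every edge with its source vertex's rank to form a message.
    messages = [(dst, vertex_table[src] / degrees[src])
                for src, dst in edge_table]

    # Group-by: sum the messages arriving at each destination vertex.
    incoming = defaultdict(float)
    for dst, contribution in messages:
        incoming[dst] += contribution

    # Map: compute the new rank of every vertex from its aggregated messages.
    return {v: (1 - damping) + damping * incoming.get(v, 0.0)
            for v in vertex_table}


ranks = vertices
for _ in range(20):
    ranks = pagerank_step(ranks, edges)
```

Because each step is an ordinary table operation, the same data can flow into non-graph stages (filtering, joining with other tables) without leaving the program, which is the kind of composition the summary attributes to a unified framework.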
Findings
GraphX achieves performance comparable to specialized graph systems.
GraphX outperforms existing systems in end-to-end graph pipelines.
GraphX balances expressiveness, performance, and ease of use.
Abstract
From social networks to language modeling, the growing scale and importance of graph data has driven the development of numerous new graph-parallel systems (e.g., Pregel, GraphLab). By restricting the computation that can be expressed and introducing new techniques to partition and distribute the graph, these systems can efficiently execute iterative graph algorithms orders of magnitude faster than more general data-parallel systems. However, the same restrictions that enable the performance gains also make it difficult to express many of the important stages in a typical graph-analytics pipeline: constructing the graph, modifying its structure, or expressing computation that spans multiple graphs. As a consequence, existing graph analytics pipelines compose graph-parallel and data-parallel systems using external storage systems, leading to extensive data movement and complicated…
Taxonomy
Topics: Graph Theory and Algorithms · Advanced Graph Neural Networks · Big Data and Digital Economy
