FlashGraph: Processing Billion-Node Graphs on an Array of Commodity SSDs
Da Zheng, Disa Mhembere, Randal Burns, Joshua Vogelstein, Carey E. Priebe, Alexander S. Szalay

TL;DR
FlashGraph enables processing of billion-node graphs on a single server with commodity SSDs by overlapping computation and I/O, achieving near in-memory performance and outperforming distributed engines.
Contribution
The paper introduces a semi-external memory graph engine that efficiently utilizes SSDs and a novel I/O merging technique for high-performance graph processing on a single machine.
Findings
Achieves up to 80% of in-memory performance
Outperforms PowerGraph significantly
Processes billion-node graphs on a single server
Abstract
Graph analysis performs many random reads and writes; thus, these workloads are typically performed in memory. Traditionally, analyzing large graphs requires a cluster of machines so the aggregate memory exceeds the graph size. We demonstrate that a multicore server can process graphs with billions of vertices and hundreds of billions of edges, utilizing commodity SSDs with minimal performance loss. We do so by implementing a graph-processing engine on top of a user-space SSD file system designed for high IOPS and extreme parallelism. Our semi-external memory graph engine, called FlashGraph, stores vertex state in memory and edge lists on SSDs. It hides latency by overlapping computation with I/O. To save I/O bandwidth, FlashGraph only accesses edge lists requested by applications from SSDs; to increase I/O throughput and reduce CPU overhead for I/O, it conservatively merges I/O requests.…
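The conservative I/O merging mentioned in the abstract can be illustrated with a minimal sketch. This is not FlashGraph's actual implementation. It assumes that each edge-list request is a hypothetical (offset, length) byte range in the on-SSD graph file, that reads are issued in whole pages of an assumed 4096 bytes, and that "conservative" means merging only requests that touch the same or adjacent pages, so no page is read that no request needs. The names merge_requests and PAGE_SIZE are illustrative only.

```python
from typing import List, Tuple

PAGE_SIZE = 4096  # assumed page size in bytes; not taken from the paper


def merge_requests(
    requests: List[Tuple[int, int]], page_size: int = PAGE_SIZE
) -> List[Tuple[int, int]]:
    """Merge (offset, length) byte ranges into page-aligned reads.

    Two requests are merged only if their page ranges overlap or are
    adjacent, so the merged reads never cover a page nobody asked for.
    """
    # Convert each non-empty request to an inclusive (first_page, last_page)
    # range and sort by first page.
    pages = sorted(
        (off // page_size, (off + length - 1) // page_size)
        for off, length in requests
        if length > 0
    )

    merged: List[List[int]] = []
    for first, last in pages:
        if merged and first <= merged[-1][1] + 1:
            # Same or adjacent page: extend the current merged read.
            merged[-1][1] = max(merged[-1][1], last)
        else:
            # A gap of at least one unrequested page: start a new read.
            merged.append([first, last])

    # Convert page ranges back to (byte offset, byte length) reads.
    return [(f * page_size, (l - f + 1) * page_size) for f, l in merged]
```

For example, merge_requests([(100, 200), (5000, 100), (40000, 10)]) returns [(0, 8192), (36864, 4096)]: the first two requests fall on adjacent pages and become one two-page read, while the third stays a separate one-page read. Fewer, larger reads are the mechanism by which merging can raise throughput and lower per-request CPU cost.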
Taxonomy
Topics: Graph Theory and Algorithms · Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques
