Accelerating Loading WebGraphs in ParaGrapher
Mohsen Koohi Esfahani

TL;DR
This paper introduces two optimizations for ParaGrapher, PG-Fuse and CompBin, that speed up graph loading by improving storage utilization and decompression bandwidth, with reported speedups of up to 7.6x and 21.8x, respectively.
Contribution
The paper presents novel filesystem and data representation optimizations that significantly accelerate large-scale graph loading in ParaGrapher.
Findings
PG-Fuse achieves up to 7.6x speedup in graph loading.
CompBin achieves up to 21.8x speedup in decompression.
Combined, they enable efficient processing of graphs with up to 128 billion edges.
Abstract
ParaGrapher is a graph loading API and library that enables graph processing frameworks to load large-scale compressed graphs with minimal overhead. This capability accelerates the design and implementation of new high-performance graph algorithms and their evaluation on a wide range of graphs and across different frameworks. However, our previous study identified two major limitations in ParaGrapher: inefficient utilization of high-bandwidth storage and reduced decompression bandwidth due to increased compression ratios. To address these limitations, we present two optimizations for ParaGrapher in this paper. To improve storage utilization, particularly for high-bandwidth storage, we introduce ParaGrapher-FUSE (PG-Fuse), a filesystem based on FUSE (Filesystem in Userspace). PG-Fuse optimizes storage access by increasing the size of requested blocks, reducing the number of calls to…
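The block-size idea in the abstract can be illustrated with a small sketch. This is not PG-Fuse itself, which is a FUSE filesystem; it is a minimal in-process analogy. The class name `BlockCachedReader`, the `read_at` method, and the 4 KiB block size used in the demo are all hypothetical choices made for illustration. The sketch serves many small reads out of a few large, block-aligned reads and counts how many requests actually reach the underlying file.

```python
import io


class BlockCachedReader:
    """Serve small reads from large, block-aligned reads of an underlying file.

    Illustrative only: names and defaults are assumptions, not PG-Fuse's API.
    """

    def __init__(self, raw, block_size=4 * 1024 * 1024):
        self.raw = raw                # any seekable binary file object
        self.block_size = block_size  # hypothetical large request size
        self.cache = {}               # block index -> bytes of that block
        self.backend_reads = 0        # requests issued to the underlying file

    def _block(self, index):
        # Fetch a whole block once, then reuse it for later small reads.
        if index not in self.cache:
            self.raw.seek(index * self.block_size)
            self.cache[index] = self.raw.read(self.block_size)
            self.backend_reads += 1
        return self.cache[index]

    def read_at(self, offset, size):
        # Assemble the requested byte range from one or more cached blocks.
        out = bytearray()
        end = offset + size
        while offset < end:
            index, start = divmod(offset, self.block_size)
            block = self._block(index)
            if start >= len(block):
                break  # request extends past the end of the file
            chunk = block[start:start + (end - offset)]
            out += chunk
            offset += len(chunk)
        return bytes(out)


# Demo: 128 small reads of 128 bytes each over a 16 KiB in-memory "file".
data = bytes(range(256)) * 64
reader = BlockCachedReader(io.BytesIO(data), block_size=4096)
pieces = [reader.read_at(off, 128) for off in range(0, len(data), 128)]
```

In this demo the 128 small reads are satisfied by 4 block-sized requests to the backing object. A real filesystem layer would also need cache eviction and concurrency control, which the sketch leaves out.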
Taxonomy
Topics: Graph Theory and Algorithms · Advanced Database Systems and Queries · Distributed and Parallel Computing Systems
