PartitionedVC: Partitioned External Memory Graph Analytics Framework for SSDs
Kiran Kumar Matam, Hanieh Hashemi, Murali Annavaram

TL;DR
PartitionedVC is a partitioned, SSD-optimized graph analytics framework that loads only active vertices and uses multi-log updates, speeding up out-of-core graph processing by up to 17.84x over existing frameworks.
Contribution
The paper proposes CSR-based graph storage and a multi-log update mechanism that together process only the active vertices and reduce SSD read amplification in out-of-core graph analytics.
Findings
Up to 17.84x speedup over existing frameworks.
Effective reduction in SSD read amplification.
Improved performance across multiple graph algorithms.
Abstract
Graph analytics are at the heart of a broad range of applications such as drug discovery, page ranking, and recommendation systems. When graph size exceeds memory size, out-of-core graph processing is needed. For the widely used external memory graph processing systems, accessing storage becomes the bottleneck. We make the observation that nearly all graph algorithms have a dynamically varying number of active vertices that must be processed in each iteration. However, existing graph processing frameworks, such as GraphChi, load the entire graph in each iteration even if only a small fraction of the graph is active. This limitation is due to the structure of the data storage used by these systems. In this work, we propose to use a compressed sparse row (CSR) based graph storage that is more amenable to selectively loading only a few active vertices in each iteration. But CSR based systems…
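To make the CSR idea in the abstract concrete, here is a minimal sketch of why CSR suits selective loading: each vertex's out-edges occupy one contiguous slice, so only the slices of active vertices need to be read. The function names `build_csr` and `load_active` are hypothetical, and in-memory Python lists stand in for SSD-resident files. This illustrates the general CSR technique, not the paper's implementation.

```python
from typing import Dict, List, Set, Tuple


def build_csr(num_vertices: int,
              edges: List[Tuple[int, int]]) -> Tuple[List[int], List[int]]:
    """Build CSR arrays (row_ptr, col_idx) from (src, dst) edge pairs."""
    # Count the out-degree of each vertex, shifted by one slot.
    row_ptr = [0] * (num_vertices + 1)
    for src, _ in edges:
        row_ptr[src + 1] += 1
    # Prefix sum: row_ptr[v] becomes the start offset of v's edge slice.
    for v in range(num_vertices):
        row_ptr[v + 1] += row_ptr[v]
    # Place each destination into its source vertex's slice.
    col_idx = [0] * len(edges)
    cursor = list(row_ptr[:-1])  # next free slot per vertex
    for src, dst in edges:
        col_idx[cursor[src]] = dst
        cursor[src] += 1
    return row_ptr, col_idx


def load_active(row_ptr: List[int], col_idx: List[int],
                active: Set[int]) -> Dict[int, List[int]]:
    """Read only the adjacency slices of the active vertices.

    Assumption: on real storage each slice would be a ranged read
    from the edge file; here it is a plain list slice.
    """
    return {v: col_idx[row_ptr[v]:row_ptr[v + 1]] for v in sorted(active)}


edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 1)]
row_ptr, col_idx = build_csr(4, edges)
neighbors = load_active(row_ptr, col_idx, {0, 3})
```

With vertices 0 and 3 active, only three of the five stored edges are touched. A layout that must stream the whole graph every iteration cannot skip the rest.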
Taxonomy
Topics: Graph Theory and Algorithms · Cloud Computing and Resource Management · Advanced Data Storage Technologies
