An I/O-Efficient Disk-based Graph System for Scalable Second-Order Random Walk of Large Graphs
Hongzheng Li, Yingxia Shao, Junping Du, Bin Cui, Lei Chen

TL;DR
This paper presents GraSorw, an I/O-efficient disk-based graph system designed to enable scalable second-order random walks on large graphs, significantly reducing execution time compared to existing systems.
Contribution
The paper introduces novel scheduling and loading strategies to optimize disk I/O for second-order random walks, overcoming scalability limitations of prior disk-based graph systems.
Findings
Reduces end-to-end task time by more than 10x compared to existing systems.
Efficiently handles large real and synthetic datasets.
Improves I/O utilization with learning-based block loading.
Abstract
Random walk is widely used in many graph analysis tasks, especially the first-order random walk. However, as a simplification of real-world problems, the first-order random walk is poor at modeling higher-order structures in the data. Recently, second-order random walk-based applications (e.g., Node2vec, Second-order PageRank) have become attractive. Due to the complexity of the second-order random walk models and memory limitations, it is not scalable to run second-order random walk-based applications on a single machine. Existing disk-based graph systems are only friendly to the first-order random walk models and suffer from expensive disk I/Os when executing the second-order random walks. This paper introduces an I/O-efficient disk-based graph system for the scalable second-order random walk of large graphs, called GraSorw. First, to eliminate massive light vertex I/Os, we develop a…
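To illustrate why second-order walks are harder on disk-based systems than first-order ones, here is a minimal sketch of a Node2vec-style transition. It is not GraSorw's implementation. The function names, the in-memory `dict` graph, and the parameter defaults are assumptions made for illustration. The point to notice is that choosing the next vertex needs the adjacency list of the previous vertex as well as the current one, so a walk that crosses graph blocks may require both blocks to be available at once.

```python
import random

def transition_weights(graph, prev, cur, p=1.0, q=1.0):
    """Unnormalized Node2vec weights for each neighbor of `cur`.

    `graph` is a hypothetical in-memory adjacency map: vertex -> list of neighbors.
    The weights depend on `prev`, which is what makes the walk second-order.
    """
    prev_neighbors = set(graph[prev])  # second adjacency list that must be accessible
    weights = []
    for nxt in graph[cur]:
        if nxt == prev:
            weights.append(1.0 / p)   # return to the previous vertex
        elif nxt in prev_neighbors:
            weights.append(1.0)       # stay at distance 1 from the previous vertex
        else:
            weights.append(1.0 / q)   # move farther away from the previous vertex
    return weights

def second_order_walk(graph, start, length, p=1.0, q=1.0, seed=None):
    """Generate one walk with at most `length` vertices, starting at `start`."""
    rng = random.Random(seed)
    walk = [start]
    if length < 2 or not graph[start]:
        return walk
    # The first step has no previous vertex, so it is sampled uniformly.
    walk.append(rng.choice(graph[start]))
    while len(walk) < length:
        prev, cur = walk[-2], walk[-1]
        if not graph[cur]:
            break  # dead end
        weights = transition_weights(graph, prev, cur, p, q)
        walk.append(rng.choices(graph[cur], weights=weights, k=1)[0])
    return walk

if __name__ == "__main__":
    toy_graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
    print(second_order_walk(toy_graph, start=0, length=6, p=2.0, q=4.0, seed=7))
```

A first-order walk would only read `graph[cur]` at each step. The extra lookup of `graph[prev]` is the access pattern that turns into additional disk I/O when the two vertices live in different on-disk blocks.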
Taxonomy
Topics: Cloud Computing and Resource Management · Graph Theory and Algorithms · Advanced Graph Neural Networks
