LSMGraph: A High-Performance Dynamic Graph Storage System with Multi-Level CSR
Song Yu, Shufeng Gong, Qian Tao, Sijie Shen, Yanfeng Zhang, Wenyuan Yu, Pengxi Liu, Zhixin Zhang, Hongfu Li, Xiaojian Luo, Ge Yu, Jingren Zhou

TL;DR
LSMGraph is a disk-based dynamic graph storage system that combines LSM-trees and CSR to optimize both read and write performance, effectively handling large graph data with concurrent updates and analysis.
Contribution
It introduces a novel multi-level structure integrating LSM-trees and CSR, along with a vertex-grained version control for efficient concurrent operations.
Findings
Outperforms existing systems in update workloads
Achieves faster read performance with multi-level CSR
Maintains correctness during concurrent read/write operations
Abstract
The growing volume of graph data may exhaust the main memory. It is crucial to design a disk-based graph storage system to ingest updates and analyze graphs efficiently. However, existing dynamic graph storage systems suffer from read or write amplification and face the challenge of optimizing both read and write performance simultaneously. To address this challenge, we propose LSMGraph, a novel dynamic graph storage system that combines the write-friendly LSM-tree and the read-friendly CSR. It leverages the multi-level structure of LSM-trees to optimize write performance while utilizing the compact CSR structures embedded in the LSM-trees to boost read performance. LSMGraph uses a new memory structure, MemGraph, to efficiently cache graph updates and uses a multi-level index to speed up reads within the multi-level structure. Furthermore, LSMGraph incorporates a vertex-grained version…
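To make the idea in the abstract concrete, here is a minimal in-memory sketch of a write buffer that is flushed into immutable CSR runs, with reads merging the buffer and every run. It is an illustration only, not LSMGraph's implementation: the names `CSRRun`, `ToyLSMGraph`, and `buffer_limit` are hypothetical, the buffer is a plain dictionary rather than the paper's MemGraph, and compaction, deletions, on-disk layout, the multi-level index, and version control are all omitted.

```python
from bisect import bisect_left
from collections import defaultdict


class CSRRun:
    """Immutable CSR snapshot of an adjacency map (hypothetical helper).

    The neighbors of vertex_ids[i] are neighbors[offsets[i]:offsets[i + 1]].
    """

    def __init__(self, adjacency):
        self.vertex_ids = sorted(adjacency)
        self.offsets = [0]
        self.neighbors = []
        for v in self.vertex_ids:
            self.neighbors.extend(sorted(adjacency[v]))
            self.offsets.append(len(self.neighbors))

    def get(self, v):
        # Binary search over the sorted vertex ids, then slice the edge array.
        i = bisect_left(self.vertex_ids, v)
        if i == len(self.vertex_ids) or self.vertex_ids[i] != v:
            return []
        return self.neighbors[self.offsets[i]:self.offsets[i + 1]]


class ToyLSMGraph:
    """Write buffer plus a list of CSR runs, newest first (assumed design)."""

    def __init__(self, buffer_limit=4):
        self.buffer_limit = buffer_limit
        self.buffer = defaultdict(set)
        self.buffered_edges = 0
        self.runs = []

    def add_edge(self, src, dst):
        # Writes only touch the in-memory buffer, which keeps them cheap.
        if dst not in self.buffer[src]:
            self.buffer[src].add(dst)
            self.buffered_edges += 1
        if self.buffered_edges >= self.buffer_limit:
            self.flush()

    def flush(self):
        # Freeze the buffer into a compact CSR run and start a fresh buffer.
        if self.buffered_edges:
            self.runs.insert(0, CSRRun(self.buffer))
            self.buffer = defaultdict(set)
            self.buffered_edges = 0

    def neighbors(self, v):
        # Reads must consult the buffer and every run, then merge the results.
        result = set(self.buffer.get(v, ()))
        for run in self.runs:
            result.update(run.get(v))
        return sorted(result)


g = ToyLSMGraph(buffer_limit=2)
for src, dst in [(1, 2), (1, 3), (2, 3), (1, 4), (1, 5)]:
    g.add_edge(src, dst)
print(len(g.runs))     # 2 runs flushed; edge (1, 5) is still buffered
print(g.neighbors(1))  # [2, 3, 4, 5]
```

In this sketch every read scans all runs, which is the kind of read amplification the abstract mentions; a real system would merge runs into lower levels and use an index to skip runs that cannot contain the vertex.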
Taxonomy
Topics: Graph Theory and Algorithms · Distributed and Parallel Computing Systems · Cloud Computing and Resource Management
