Scheduling of Graph Queries: Controlling Intra- and Inter-query Parallelism for a High System Throughput
Matthias Hauck (1, 2), Ismail Oukid (3), Holger Fröning (1) ((1) Heidelberg University, (2) SAP SE, (3) Snowflake Inc)

TL;DR
This paper presents a method for scheduling graph queries that adaptively controls intra- and inter-query parallelism to maximize system throughput and efficiency across diverse graph data sets and query workloads.
Contribution
It introduces an automatic scheduling approach based on graph sampling, algorithm/system constraints, and work package generation, achieving near-optimal performance with low overhead.
Findings
Robust performance across various configurations.
Performance close to manually optimized implementations.
Effective handling of both large and small query workloads.
Abstract
The vast amounts of data used in social, business or traffic networks, biology and other natural sciences are often managed in graph-based data sets, consisting of a few thousand up to billions and trillions of vertices and edges, respectively. Typical applications utilizing such data either execute one or a few complex queries or many small queries at the same time interactively or as batch jobs. Furthermore, graph processing is inherently complex, as data sets can substantially differ (scale free vs. constant degree), and algorithms exhibit diverse behavior (computational intensity, local or global, push- or pull-based). This work is concerned with multi-query execution by automatically controlling the degree of parallelization, with overall objectives including high system utilization, low synchronization cost, and highly efficient concurrent execution. The underlying concept is…
Taxonomy
Topics: Graph Theory and Algorithms · Cloud Computing and Resource Management · Data Management and Algorithms
