gMark: Schema-Driven Generation of Graphs and Queries

Guillaume Bagan; Angela Bonifati; Radu Ciucanu; George H. L. Fletcher; Aurélien Lemay; Nicky Advokaat

arXiv:1511.08386·cs.DB·December 7, 2016

TL;DR

gMark is a schema-driven framework for generating graph instances and query workloads with controllable properties. It supports regular path queries and schema-driven selectivity estimation, which makes it useful for experimental research on graph database systems.

Contribution

The paper introduces gMark, a domain- and query language-independent generator of graph instances and query workloads, with controllable properties and schema-driven selectivity estimation of queries.

Findings

01 Supports regular path queries and schema-driven selectivity estimation.

02 Enables generation of high-quality, diverse graph instances and workloads.

03 Demonstrates practical usability across various application domains.

Abstract

Massive graph data sets are pervasive in contemporary application domains. Hence, graph database systems are becoming increasingly important. In the experimental study of these systems, it is vital that the research community has shared solutions for the generation of database instances and query workloads having predictable and controllable properties. In this paper, we present the design and engineering principles of gMark, a domain- and query language-independent graph instance and query workload generator. A core contribution of gMark is its ability to target and control the diversity of properties of both the generated instances and the generated workloads coupled to these instances. Further novelties include support for regular path queries, a fundamental graph query paradigm, and schema-driven selectivity estimation of queries, a key feature in controlling workload chokepoints.…
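
To make the notion of a regular path query concrete, the following is a minimal sketch of how such a query can be evaluated on a small edge-labeled graph. It is not gMark's code or API: the function `eval_rpq`, the toy graph, the edge labels, and the hand-written automaton are all hypothetical, chosen only for illustration. The query `knows+ . worksAt` is encoded as a deterministic finite automaton, and the evaluation explores pairs of (graph node, automaton state) breadth-first.

```python
from collections import deque


def eval_rpq(edges, dfa, start_state, accept_states):
    """Return all (source, target) node pairs linked by a path whose
    sequence of edge labels is accepted by the automaton.

    edges: iterable of (src, label, dst) triples (hypothetical toy format).
    dfa:   dict mapping (state, label) -> next state.
    """
    # Build an adjacency map: node -> list of (label, neighbor).
    adj = {}
    nodes = set()
    for src, label, dst in edges:
        adj.setdefault(src, []).append((label, dst))
        nodes.update((src, dst))

    answers = set()
    for source in nodes:
        # Breadth-first search over (graph node, automaton state) pairs.
        seen = {(source, start_state)}
        queue = deque(seen)
        while queue:
            node, state = queue.popleft()
            if state in accept_states:
                answers.add((source, node))
            for label, neighbor in adj.get(node, []):
                next_state = dfa.get((state, label))
                if next_state is not None and (neighbor, next_state) not in seen:
                    seen.add((neighbor, next_state))
                    queue.append((neighbor, next_state))
    return answers


# Hypothetical toy graph with two edge labels.
edges = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "worksAt", "acme"),
]

# Automaton for the query: knows+ . worksAt
dfa = {
    (0, "knows"): 1,
    (1, "knows"): 1,
    (1, "worksAt"): 2,
}

result = eval_rpq(edges, dfa, start_state=0, accept_states={2})
print(sorted(result))
```

On this toy graph the query returns the pairs (alice, acme) and (bob, acme). The number of answer pairs relative to the graph size is the kind of quantity a selectivity estimate aims to predict; how gMark estimates it from the schema is described in the paper itself and is not reproduced here.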
