Designing and Comparing RPQ Semantics
Victor Marsault, Antoine Meyer

TL;DR
This paper introduces a formal framework to categorize, compare, and analyze various semantics for regular path queries (RPQs) in property graph databases, addressing the challenge of selecting finite, user-friendly match sets.
Contribution
It formalizes properties of RPQ semantics, shows that some of these properties are mutually exclusive, and proposes new semantics to guide future language design.
Findings
Some properties of RPQ semantics are mutually exclusive.
Existing academic research on these semantics focuses almost exclusively on evaluation efficiency.
New semantics are proposed to improve usability and expressiveness.
Abstract
Modern property graph database query languages such as Cypher, PGQL, GSQL, and the standard GQL draw inspiration from the formalism of regular path queries (RPQs). In order to output walks explicitly, they depart from the classical and well-studied homomorphism semantics. However, it then becomes difficult to present results to users because RPQs may match infinitely many walks. The aforementioned languages use ad-hoc criteria to select a finite subset of those matches. For instance, Cypher uses trail semantics, discarding walks with repeated edges; PGQL and GSQL use shortest walk semantics, retaining only the walks of minimal length among all matched walks; and GQL allows users to choose from several semantics. Even though there is academic research on these semantics, it focuses almost exclusively on evaluation efficiency. In an attempt to better understand, choose and design RPQ…
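The difference between trail semantics and shortest walk semantics can be illustrated on a toy graph. The following is a minimal sketch, not the paper's formalism: the graph, the query a*b, and the names EDGES, matching_walks, trails and shortest are all hypothetical. It assumes single-character edge labels so that Python's re module can stand in for the regular expression of the RPQ, and it bounds walk length by max_len only so that the enumeration terminates, since the full set of matches is infinite on a cyclic graph.

```python
import re

# Hypothetical toy graph: each edge is (edge_id, source, label, target).
# Edges 0 and 1 form a cycle s -> u -> s, so infinitely many walks match a*b.
EDGES = [
    (0, "s", "a", "u"),
    (1, "u", "a", "s"),
    (2, "u", "b", "t"),
    (3, "s", "a", "v"),
    (4, "v", "a", "w"),
    (5, "w", "b", "t"),
]


def matching_walks(edges, source, target, pattern, max_len):
    """Return all walks (tuples of edge ids) from source to target whose
    label word fully matches pattern, restricted to length <= max_len.
    The bound is an assumption of this sketch, not part of any semantics."""
    regex = re.compile(pattern)
    labels = {eid: label for eid, _, label, _ in edges}
    found = []
    frontier = [(source, ())]  # (current node, edge ids walked so far)
    while frontier:
        node, walk = frontier.pop()
        word = "".join(labels[e] for e in walk)
        if node == target and regex.fullmatch(word):
            found.append(walk)
        if len(walk) < max_len:
            for eid, src, _, tgt in edges:
                if src == node:
                    frontier.append((tgt, walk + (eid,)))
    # Sort by length, then by edge ids, for a deterministic order.
    return sorted(found, key=lambda w: (len(w), w))


walks = matching_walks(EDGES, "s", "t", "a*b", max_len=6)

# Trail semantics: discard walks that repeat an edge.
trails = [w for w in walks if len(set(w)) == len(w)]

# Shortest walk semantics: keep only the walks of minimal length.
min_len = min(len(w) for w in walks)
shortest = [w for w in walks if len(w) == min_len]
```

On this graph, the trail filter keeps three walks, including one that goes around the cycle once before taking the longer branch, while the shortest walk filter keeps a single walk of length 2. Raising max_len adds more matched walks but leaves both filtered sets unchanged, which is what makes these selections finite.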
Taxonomy
Topics: Graph Theory and Algorithms · Advanced Database Systems and Queries · Data Management and Algorithms
