Evaluating Regular Path Queries on Compressed Adjacency Matrices
Diego Arroyuelo, Adrián Gómez-Brandón, Gonzalo Navarro

TL;DR
This paper introduces a novel sparse matrix-based approach for evaluating Regular Path Queries on graphs, achieving significant space efficiency and fast query processing, especially for complex RPQs with unspecified endpoints.
Contribution
It presents a Boolean algebra on sparse matrix representations and a compact $k^2$-tree-based structure; together they dominate a good portion of the space/time tradeoff map for RPQ evaluation.
Findings
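The baseline representation uses the same space as the previously most compact RPQ index and outperforms it on the hardest queries, those where both endpoints are unspecified
The $k^2$-tree-based structure is 4 times smaller than any existing representation that handles RPQs
The $k^2$-tree-based structure still solves complex RPQs in a few seconds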
Abstract
Regular Path Queries (RPQs), which are essentially regular expressions to be matched against the labels of paths in labeled graphs, are at the core of graph database query languages like SPARQL. A way to solve RPQs is to translate them into a sequence of operations on the adjacency matrices of each label. We design and implement a Boolean algebra on sparse matrix representations and, as an application, use them to handle RPQs. Our baseline representation uses the same space as the previously most compact index for RPQs and outperforms it on the hardest types of queries -- those where both RPQ endpoints are unspecified. Our more succinct structure, based on $k^2$-trees, is 4 times smaller than any existing representation that handles RPQs, and still solves complex RPQs in a few seconds. Our new sparse-matrix-based representations dominate a good portion of the space/time tradeoff map,…
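To illustrate the idea of translating an RPQ into operations on per-label adjacency matrices, here is a minimal sketch. It is not the authors' implementation: each Boolean matrix is stored as a plain Python set of (row, column) pairs rather than a compressed structure, and the names `union`, `compose`, `star`, `edges`, and `answer` are hypothetical choices made for this example. Alternation maps to Boolean sum, concatenation to Boolean product, and Kleene star to reflexive-transitive closure.

```python
# Sketch only: a Boolean sparse matrix is a set of (row, col) pairs.
# All names here are illustrative assumptions, not the paper's API.

def union(a, b):
    """Boolean sum: corresponds to alternation (E1 | E2) in an RPQ."""
    return a | b

def compose(a, b):
    """Boolean product: corresponds to concatenation (E1 / E2).

    (i, j) is set iff some k exists with (i, k) in a and (k, j) in b.
    """
    cols_by_row = {}
    for k, j in b:
        cols_by_row.setdefault(k, set()).add(j)
    return {(i, j) for i, k in a for j in cols_by_row.get(k, ())}

def star(a, n):
    """Reflexive-transitive closure: corresponds to Kleene star (E*).

    n is the number of nodes; only newly found pairs are extended
    in each round, so the loop stops once nothing new appears.
    """
    result = {(v, v) for v in range(n)}
    frontier = set(result)
    while frontier:
        frontier = compose(frontier, a) - result
        result |= frontier
    return result

# Toy labeled graph on nodes 0..3: one adjacency matrix per label.
edges = {
    "a": {(0, 1), (1, 2)},
    "b": {(2, 3)},
}

# RPQ  ?x  a*/b  ?y  with both endpoints unspecified.
answer = compose(star(edges["a"], 4), edges["b"])
print(sorted(answer))  # [(0, 3), (1, 3), (2, 3)]
```

With both endpoints unspecified, the whole result matrix is the answer, which is why this query type is the natural fit for a matrix formulation; fixing an endpoint would amount to restricting the computation to a single row or column.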
Taxonomy
Topics: Graph Theory and Algorithms · Data Management and Algorithms · Advanced Database Systems and Queries
