Dynamic Interleaving of Content and Structure for Robust Indexing of Semi-Structured Hierarchical Data (Extended Version)
Kevin Wellenzohn, Michael H. Böhlen, Sven Helmer

TL;DR
This paper introduces a robust trie-based index for semi-structured hierarchical data that dynamically interleaves content and structure. It efficiently supports flexible content-and-structure (CAS) queries and improves on existing approaches by up to two orders of magnitude.
Contribution
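It presents a novel dynamic interleaving scheme that merges the path and value dimensions of composite keys in a balanced way, and stores these keys in a trie-based index that is robust against varying selectivities and supports a wide range of CAS queries.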
Findings
Supports a wide range of CAS queries including wildcards and descendant axes
Achieves up to two orders of magnitude performance improvement over existing methods
Demonstrates robustness against varying selectivities
Abstract
We propose a robust index for semi-structured hierarchical data that supports content-and-structure (CAS) queries specified by path and value predicates. At the heart of our approach is a novel dynamic interleaving scheme that merges the path and value dimensions of composite keys in a balanced way. We store these keys in our trie-based Robust Content-And-Structure index, which efficiently supports a wide range of CAS queries, including queries with wildcards and descendant axes. Additionally, we show important properties of our scheme, such as robustness against varying selectivities, and demonstrate improvements of up to two orders of magnitude over existing approaches in our experimental evaluation.
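To make the notion of a CAS query concrete, the following minimal sketch evaluates a path predicate and a value predicate against composite (path, value) keys by brute-force scanning. It illustrates only the query semantics, not the index or the dynamic interleaving scheme from the paper. The path syntax ('/' as separator, '*' for exactly one label, '//' for the descendant axis), the inclusive value range, the function names, and the sample data are all assumptions made for this illustration.

```python
import re


def path_pattern_to_regex(pattern):
    """Translate a path predicate into a compiled regular expression.

    Assumed syntax (hypothetical, for illustration only):
      '/'  separates labels,
      '*'  matches exactly one label,
      '//' (descendant axis) matches zero or more intermediate labels.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("//", i):
            parts.append(r"/(?:[^/]+/)*")  # any number of whole labels
            i += 2
        elif pattern[i] == "/":
            parts.append("/")
            i += 1
        elif pattern[i] == "*":
            parts.append(r"[^/]+")  # one label, no separator inside
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def cas_query(keys, path_pattern, lo, hi):
    """Return all (path, value) keys whose path matches path_pattern
    and whose value lies in the inclusive range [lo, hi].

    This is a linear scan; an index exists precisely to avoid it.
    """
    regex = path_pattern_to_regex(path_pattern)
    return [
        (path, value)
        for path, value in keys
        if regex.fullmatch(path) and lo <= value <= hi
    ]


# Hypothetical composite keys: (path, value).
keys = [
    ("/bom/item/car/battery", 120),
    ("/bom/item/car/belt", 30),
    ("/bom/item/bike/battery", 45),
    ("/bom/battery", 300),
    ("/stock/item/car/battery", 120),
]

# Descendant axis: any 'battery' below '/bom' with a value in [40, 150].
descendant_hits = cas_query(keys, "/bom//battery", 40, 150)

# Wildcard: exactly one label between 'item' and 'battery'.
wildcard_hits = cas_query(keys, "/bom/item/*/battery", 100, 200)
```

In this scan, neither predicate helps prune the work done for the other: every key is tested against both. How selective each predicate is varies from query to query, which is the setting in which the abstract's robustness claim applies.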
