[Technical Report] Combining Sampling and Synopses with Worst-Case Optimal Runtime and Quality Guarantees for Graph Pattern Cardinality Estimation
Kyoungmin Kim, Hyeonji Kim, George Fletcher, Wook-Shin Han

TL;DR
This paper introduces Alley, a hybrid graph pattern cardinality estimation method combining sampling and synopses, with worst-case optimal guarantees and superior accuracy demonstrated through extensive experiments.
Contribution
Alley is a novel hybrid approach that integrates sampling and synopses with new strategies, achieving worst-case optimal runtime and approximation quality guarantees for graph pattern cardinality estimation.
Findings
Alley outperforms state-of-the-art methods by up to orders of magnitude in accuracy.
Alley achieves this higher accuracy at an efficiency similar to that of the state-of-the-art methods.
Theoretical guarantees ensure worst-case optimal runtime and approximation quality.
Abstract
Graph pattern cardinality estimation is the problem of estimating the number of embeddings of a query graph in a data graph. This fundamental problem arises, for example, during query planning in subgraph matching algorithms. There are two major approaches to solving the problem: sampling and synopsis. Synopsis (or summary)-based methods are fast and accurate if synopses capture information of graphs well. However, these methods suffer from large errors due to loss of information during summarization and inherent assumptions. Sampling-based methods are unbiased but suffer from large estimation variance due to large sample space. To address these limitations, we propose Alley, a hybrid method that combines both sampling and synopses. Alley employs 1) a novel sampling strategy, random walk with intersection, which effectively reduces the sample space, 2) branching to further reduce…
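To make the abstract's point about sampling concrete, here is a minimal sketch of a plain random-walk estimator with inverse-probability weighting. It is not Alley's random walk with intersection, only the kind of baseline sampling the abstract describes as unbiased but high-variance. The assumptions are ours: the data graph is undirected and stored as a dict of adjacency lists, the query is a fixed two-edge path u-v-w, and matches are counted as homomorphisms (u and w may coincide). All function names are hypothetical.

```python
import random


def exact_two_edge_walks(adj):
    """Exact number of sequences (u, v, w) with edges u-v and v-w.

    Every vertex v contributes deg(v)^2 such sequences as the middle vertex.
    """
    return sum(len(nbrs) ** 2 for nbrs in adj.values())


def sample_once(adj, rng):
    """Draw one random walk and return its inverse-probability weight.

    The walk picks u uniformly from all n vertices, then v uniformly from
    N(u), then w uniformly from N(v). A specific (u, v, w) is drawn with
    probability 1 / (n * deg(u) * deg(v)), so weighting each successful
    walk by n * deg(u) * deg(v) gives an unbiased estimate of the count.
    """
    vertices = list(adj)
    u = rng.choice(vertices)
    if not adj[u]:
        return 0.0  # the walk fails at an isolated vertex
    v = rng.choice(adj[u])
    # In an undirected graph v has at least the neighbour u, so this succeeds.
    # w completes the walk; the weight itself does not depend on which w is drawn.
    w = rng.choice(adj[v])
    return float(len(vertices) * len(adj[u]) * len(adj[v]))


def estimate(adj, num_samples, seed=0):
    """Average the weights of num_samples independent walks."""
    rng = random.Random(seed)
    total = sum(sample_once(adj, rng) for _ in range(num_samples))
    return total / num_samples


# Path graph 0-1-2-3: individual sample weights are 8 or 16, true count is 10.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(exact_two_edge_walks(path_graph))
print(estimate(path_graph, num_samples=20000))
```

Even on this four-vertex path, a single sample is off by 2 or by 6 from the true count of 10, and only the average over many walks converges. On larger queries and skewed graphs the sample space and the spread of weights grow quickly, which is the variance problem the abstract attributes to sampling-based methods.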
Taxonomy
Topics: Graph Theory and Algorithms · Advanced Graph Neural Networks · Data Management and Algorithms
