Cardinality Estimation of Subgraph Matching: A Filtering-Sampling Approach
Wonseok Shin, Siwoo Song, Kunsoo Park, Wook-Shin Han

TL;DR
The paper introduces FaSTest, a subgraph cardinality estimation algorithm that combines filtering with sampling. On real-world datasets it is up to two orders of magnitude more accurate than state-of-the-art sampling-based methods and up to three orders of magnitude more accurate than GNN-based methods.
Contribution
FaSTest combines three components: a filtering technique that reduces the sample space, an adaptive tree sampling algorithm for accurate and efficient estimation, and a worst-case optimal stratified graph sampling algorithm for difficult instances.
Findings
FaSTest is up to 100x more accurate than state-of-the-art sampling-based methods.
FaSTest is up to 1000x more accurate than GNN-based methods.
Both results come from experiments on real-world datasets.
Abstract
Subgraph counting is a fundamental problem in understanding and analyzing graph structured data, yet computationally challenging. This calls for an accurate and efficient algorithm for Subgraph Cardinality Estimation, which is to estimate the number of all isomorphic embeddings of a query graph in a data graph. We present FaSTest, a novel algorithm that combines (1) a powerful filtering technique to significantly reduce the sample space, (2) an adaptive tree sampling algorithm for accurate and efficient estimation, and (3) a worst-case optimal stratified graph sampling algorithm for difficult instances. Extensive experiments on real-world datasets show that FaSTest outperforms state-of-the-art sampling-based methods by up to two orders of magnitude and GNN-based methods by up to three orders of magnitude in terms of accuracy.
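To make the problem statement concrete, the sketch below counts the embeddings of a small query graph exactly by backtracking, then estimates the same count by sampling. It is a generic Horvitz-Thompson style estimator, not FaSTest: it has no filtering, adaptive tree sampling, or stratification. The function names, the adjacency-set encoding, and the fixed matching order 0, 1, 2, ... are all assumptions made for illustration.

```python
import random

def candidates(data, query_adj, mapping, u):
    """Data vertices that can extend `mapping` to query vertex u.

    data: dict vertex -> set of neighbours (hypothetical encoding).
    query_adj: list where query_adj[u] is the set of neighbours of u.
    mapping: tuple, mapping[i] is the data vertex matched to query vertex i.
    """
    mapped_nbrs = [mapping[w] for w in query_adj[u] if w < len(mapping)]
    if mapped_nbrs:
        # Must be adjacent to the image of every already-matched neighbour.
        pool = set.intersection(*(data[v] for v in mapped_nbrs))
    else:
        pool = set(data)
    # Embeddings are injective, so drop vertices that are already used.
    return sorted(pool - set(mapping))

def exact_count(data, query_adj, mapping=()):
    """Count all embeddings by exhaustive backtracking (exponential time)."""
    u = len(mapping)
    if u == len(query_adj):
        return 1
    return sum(exact_count(data, query_adj, mapping + (v,))
               for v in candidates(data, query_adj, mapping, u))

def estimate_count(data, query_adj, num_samples, rng):
    """Average of inverse-probability weights over random extension paths."""
    total = 0.0
    for _ in range(num_samples):
        mapping, weight = (), 1.0
        for u in range(len(query_adj)):
            cands = candidates(data, query_adj, mapping, u)
            if not cands:
                weight = 0.0  # dead end: this sample contributes nothing
                break
            weight *= len(cands)  # 1 / probability of the uniform choice
            mapping += (rng.choice(cands),)
        total += weight
    return total / num_samples

# Triangle query; data graph is a triangle 0-1-2 with a pendant vertex 3.
triangle = [{1, 2}, {0, 2}, {0, 1}]
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
exact = exact_count(graph, triangle)
approx = estimate_count(graph, triangle, 20000, random.Random(0))
```

Each sample follows one random path through the search tree and is weighted by the inverse of its selection probability, so the average is an unbiased estimate of the exact count. Samples that reach a dead end contribute zero, and shrinking the candidate sets before sampling reduces how often that happens, which is the role the abstract assigns to filtering.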
Taxonomy
Topics: Advanced Graph Neural Networks · Bayesian Modeling and Causal Inference · Graph Theory and Algorithms
