AGIS: Fast Approximate Graph Pattern Mining with Structure-Informed Sampling
Seoyong Lee, Jinho Lee

TL;DR
AGIS is a structure-informed sampling system for fast, scalable approximate graph pattern mining on massive graphs. It outperforms existing systems in speed and in the range of patterns it can handle.
Contribution
We introduce AGIS, a new sampling technique for approximate graph pattern mining (AGPM) that assigns sampling probabilities based on the pattern structure, enabling scalable and accurate pattern counting on very large graphs.
Findings
Achieves a 28.5x speedup over state-of-the-art systems
Scales to graphs with tens of billions of edges
Provides accurate estimates within seconds
Abstract
Approximate Graph Pattern Mining (AGPM) is essential for analyzing large-scale graphs where exact counting is computationally prohibitive. While numerous sampling-based AGPM systems exist, they all rely on uniform sampling and overlook the underlying probability distribution. This limitation restricts their scalability to a broader range of patterns. In this paper, we introduce AGIS, an extremely fast AGPM system capable of counting arbitrary patterns in huge graphs. AGIS employs structure-informed neighbor sampling, a novel sampling technique that departs from uniformity and instead allocates specific sampling probabilities based on the pattern structure. We first derive the ideal sampling distribution for AGPM and then present a practical method to approximate it. Furthermore, we develop a method that balances convergence speed and computational overhead, determining when to use…
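To make the contrast between uniform and non-uniform neighbor sampling concrete, here is a minimal sketch of a neighbor-sampling triangle estimator with a pluggable proposal distribution. It is not the AGIS algorithm: the function names, the toy graph, and the degree-weighted proposal are all hypothetical and chosen only for illustration. The sketch shows that dividing each sample by its sampling probability keeps the estimate unbiased for any proposal that gives every neighbor a nonzero probability, so the choice of proposal affects only the variance, that is, how quickly the estimate converges.

```python
import random
from itertools import combinations

# Toy undirected graph as an adjacency dict (illustrative data, not from the paper).
GRAPH = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}

def exact_triangles(adj):
    """Brute-force triangle count, used only as ground truth on tiny graphs."""
    return sum(1 for a, b, c in combinations(sorted(adj), 3)
               if b in adj[a] and c in adj[a] and c in adj[b])

def uniform_proposal(adj, u, v):
    """Baseline proposal: every neighbor of v is equally likely."""
    nbrs = sorted(adj[v])
    return {w: 1.0 / len(nbrs) for w in nbrs}

def degree_proposal(adj, u, v):
    """Hypothetical non-uniform proposal: favor high-degree neighbors of v."""
    nbrs = sorted(adj[v])
    total = sum(len(adj[w]) for w in nbrs)
    return {w: len(adj[w]) / total for w in nbrs}

def directed_edges(adj):
    """Both orientations of every undirected edge."""
    return [(u, v) for u in sorted(adj) for v in sorted(adj[u])]

def sample_once(adj, proposal, rng):
    """Draw one walk u -> v -> w and return its importance-weighted estimate."""
    edges = directed_edges(adj)
    u, v = rng.choice(edges)                      # probability 1 / len(edges)
    q = proposal(adj, u, v)
    nbrs = list(q)
    w = rng.choices(nbrs, weights=[q[x] for x in nbrs])[0]
    if w == u or w not in adj[u]:
        return 0.0                                # walk does not close a triangle
    # Each triangle is reachable through 6 ordered walks (u, v, w),
    # so divide by 6 as well as by the sampling probability.
    return len(edges) / (6.0 * q[w])

def estimate(adj, proposal, n_samples, seed=0):
    """Average of n_samples independent single-walk estimates."""
    rng = random.Random(seed)
    return sum(sample_once(adj, proposal, rng) for _ in range(n_samples)) / n_samples

def expected_value(adj, proposal):
    """Exact expectation of sample_once, computed by enumerating all walks."""
    edges = directed_edges(adj)
    total = 0.0
    for u, v in edges:
        for w, qw in proposal(adj, u, v).items():
            if w != u and w in adj[u]:
                total += (1.0 / len(edges)) * qw * (len(edges) / (6.0 * qw))
    return total
```

Calling `expected_value(GRAPH, uniform_proposal)` and `expected_value(GRAPH, degree_proposal)` should both reproduce `exact_triangles(GRAPH)` up to floating-point error, while `estimate(GRAPH, degree_proposal, 10_000)` gives a randomized approximation. A proposal that concentrates probability on neighbors likely to complete the pattern would reduce variance, which is the general motivation behind moving away from uniform sampling.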
Taxonomy
Topics: Graph Theory and Algorithms · Advanced Graph Neural Networks · Data Mining Algorithms and Applications
