Approximate Butterfly Counting in Sublinear Time
Chi Luo, Jiaxin Song, Yuhao Zhang, Kai Wang, Zhixing He, Kuan Yang

TL;DR
This paper introduces TLS, a sublinear-time two-level sampling algorithm that estimates butterfly counts in large bipartite graphs using a limited number of degree, neighbor, and vertex-pair queries, reducing query cost and runtime while maintaining accuracy.
Contribution
The paper presents TLS, a practical sampling algorithm with theoretical guarantees for butterfly counting in bipartite graphs under a query model, including techniques for variance control and efficiency.
Findings
TLS achieves a (1+ε)-approximation of the butterfly count with sublinear query complexity.
Extensive experiments show TLS reduces query costs and runtime by up to three orders of magnitude.
TLS maintains high accuracy across diverse large bipartite graphs.
Abstract
Bipartite graphs serve as a natural model for representing relationships between two different types of entities. When analyzing bipartite graphs, butterfly counting is a fundamental research problem that aims to count the number of butterflies (i.e., 2x2 bicliques) in a given bipartite graph. While this problem has been extensively studied in the literature, existing algorithms usually necessitate access to a large portion of the entire graph, presenting challenges in real scenarios where graphs are extremely large and I/O costs are expensive. In this paper, we study the butterfly counting problem under the query model, where the following query operations are permitted: degree query, neighbor query, and vertex-pair query. We propose TLS, a practical two-level sampling algorithm that can estimate the butterfly count accurately while accessing only a limited graph structure, achieving…
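The query model named in the abstract can be illustrated with a minimal Python sketch. This is not the TLS algorithm; it only shows what the three query types might look like and how an exact butterfly count (the quantity TLS estimates) can be computed on a small graph for reference. The class `QueryOracle`, its method names, and the convention that `pair(u, v)` takes a left vertex `u` and a right vertex `v` are all assumptions made for illustration.

```python
from itertools import combinations
from math import comb


class QueryOracle:
    """Hypothetical oracle over a bipartite graph that counts every query issued."""

    def __init__(self, edges):
        # Each edge is (left_vertex, right_vertex); duplicates are ignored.
        self.edges = set(edges)
        self.left = {}   # left vertex  -> sorted list of right neighbors
        self.right = {}  # right vertex -> sorted list of left neighbors
        for u, v in self.edges:
            self.left.setdefault(u, []).append(v)
            self.right.setdefault(v, []).append(u)
        for nbrs in list(self.left.values()) + list(self.right.values()):
            nbrs.sort()
        self.queries = 0

    def _adj(self, side):
        return self.left if side == "L" else self.right

    def degree(self, side, x):
        """Degree query: number of neighbors of vertex x on the given side."""
        self.queries += 1
        return len(self._adj(side).get(x, []))

    def neighbor(self, side, x, i):
        """Neighbor query: the i-th neighbor (0-based) of vertex x."""
        self.queries += 1
        return self._adj(side)[x][i]

    def pair(self, u, v):
        """Vertex-pair query: is there an edge between left u and right v?"""
        self.queries += 1
        return (u, v) in self.edges


def exact_butterflies(oracle):
    """Exact count for reference; reads the whole graph, so it is not sublinear."""
    total = 0
    for u, w in combinations(sorted(oracle.left), 2):
        common = len(set(oracle.left[u]) & set(oracle.left[w]))
        # Every pair of common right neighbors closes one 2x2 biclique.
        total += comb(common, 2)
    return total
```

For example, the complete bipartite graph with two vertices on each side contains exactly one butterfly, and `oracle.queries` reports how many queries a procedure has spent, which is the cost a sublinear-time estimator aims to keep small.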
Taxonomy
Topics: Graph Theory and Algorithms · Complex Network Analysis Techniques · Data Management and Algorithms
