Sublinear-Time Algorithms for Counting Star Subgraphs with Applications to Join Selectivity Estimation
Maryam Aliakbarpour, Amartya Shankha Biswas, Themistoklis Gouleakis, John Peebles, Ronitt Rubinfeld, Anak Yodpinyanee

TL;DR
This paper introduces sublinear-time algorithms for estimating sums that arise in counting star subgraphs and in join selectivity estimation. By sampling edges directly, the algorithms beat lower bounds that apply to methods restricted to vertex sampling.
Contribution
The paper presents sublinear-time algorithms for counting star subgraphs and estimating join selectivity via edge sampling, together with tight lower bounds and improved efficiency over prior vertex-sampling methods.
Findings
Algorithm achieves a $(1 \pm \varepsilon)$-approximation with query complexity $O\left(\frac{m \log \log n}{\varepsilon^2 S_p^{1/p}}\right)$
Provides tight lower bounds for the problem, even with graph structure knowledge
Edge sampling allows surpassing previous lower bounds based on vertex sampling
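To make the edge-sampling idea concrete, here is a minimal Python sketch of a naive unbiased estimator. It is not the paper's algorithm and does not achieve the query complexity quoted above; it only illustrates why sampling an edge and then a random endpoint yields a vertex with probability proportional to its degree. It assumes a simple graph without self-loops and that the quantity of interest is the number of $p$-stars, $\sum_v \binom{\deg(v)}{p}$. The function names and the precomputed degree table (standing in for a degree-query oracle) are illustrative assumptions.

```python
import random
from collections import Counter
from math import comb


def degree_table(edges):
    """Degrees of all vertices; stands in for a degree-query oracle."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def exact_star_count(edges, p):
    """Exact number of p-stars: sum over vertices of C(deg(v), p)."""
    return sum(comb(d, p) for d in degree_table(edges).values())


def estimate_star_count(edges, p, num_samples, rng=None):
    """Naive importance-sampling estimate of the p-star count.

    A uniform edge followed by a uniform endpoint selects vertex v with
    probability deg(v) / (2m), so 2m * C(deg(v), p) / deg(v) is an
    unbiased estimate of the sum over all vertices.
    """
    rng = rng or random.Random()
    deg = degree_table(edges)  # linear-time setup, for the demo only
    m = len(edges)
    total = 0.0
    for _ in range(num_samples):
        u, v = rng.choice(edges)             # uniform edge sample
        w = u if rng.random() < 0.5 else v   # uniform endpoint of that edge
        d = deg[w]                           # degree query; d >= 1 here
        total += 2 * m * comb(d, p) / d
    return total / num_samples
```

On a regular graph every sample returns the same value, so the estimate is exact; on graphs with skewed degrees the variance grows, which is the kind of difficulty a more careful algorithm has to control.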
Abstract
We study the problem of estimating the value of sums of the form $S_p \triangleq \sum \binom{x_i}{p}$ when one has the ability to sample $x_i \geq 0$ with probability proportional to its magnitude. When $p = 2$, this problem is equivalent to estimating the selectivity of a self-join query in database systems when one can sample rows randomly. We also study the special case when $\{x_i\}$ is the degree sequence of a graph, which corresponds to counting the number of $p$-stars in a graph when one has the ability to sample edges randomly. Our algorithm for a $(1 \pm \varepsilon)$-multiplicative approximation of $S_p$ has query and time complexities $O\left(\frac{m \log \log n}{\varepsilon^2 S_p^{1/p}}\right)$. Here, $m = \sum x_i / 2$ is the number of edges in the graph, or equivalently, half the number of records in the database table. Similarly, $n$ is the number of vertices in the graph and the number…
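The self-join connection can be illustrated with a small Python sketch. It assumes the self-join case corresponds to $p = 2$ and that $x_i$ is the number of rows carrying the $i$-th distinct join-key value, so the sum counts unordered pairs of rows that share a key. The function names and the example column are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import comb


def self_join_pairs(column):
    """Sum of C(x_i, 2) over the frequencies x_i of each distinct key."""
    freq = Counter(column)
    return sum(comb(x, 2) for x in freq.values())


def self_join_pairs_bruteforce(column):
    """Same quantity by checking every unordered pair of rows directly."""
    return sum(1 for a, b in combinations(column, 2) if a == b)


keys = ["a", "b", "a", "c", "a", "b"]
print(self_join_pairs(keys), self_join_pairs_bruteforce(keys))
```

Under this reading, sampling a row uniformly at random selects key value $i$ with probability proportional to $x_i$, which matches the sampling model described in the abstract.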
Taxonomy
Topics: Distributed systems and fault tolerance · Complexity and Algorithms in Graphs · Optimization and Search Problems
