Spectral Clustering in Birthday Paradox Time
Michael Kapralov, Ekaterina Kochetkova, Weronika Wrzos-Kaminska

TL;DR
This paper introduces a new representation of vertices in clusterable graphs that identifies a vertex's cluster from roughly (n/k)^{1/2+O(ε/φ^2)} random walk samples per vertex, the number suggested by the birthday paradox, removing the polynomial dependence on k incurred by previous methods.
Contribution
The authors develop a new vertex representation based on a mixture of logarithmic-length random walks. It attains sample complexity matching the birthday paradox bound and supports fast cluster membership queries.
Findings
Representation uses approximately (n/k)^{1/2+O(ε/φ^2)} walks per vertex.
Allows nearly linear time cluster identification.
Matches the birthday paradox bound for sample complexity.
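As a back-of-the-envelope check of where the birthday paradox bound comes from, consider the idealized case where walks from two vertices in the same cluster end at uniformly random vertices of that cluster. The symbols m (cluster size) and s (walks per vertex) are introduced here for illustration, and the uniformity assumption is a simplification, not the paper's analysis; the O(ε/φ^2) term in the exponent is not derived here.

```latex
% Idealized setting: each cluster has m = n/k vertices, and the endpoints
% of the s walks from each of two same-cluster vertices u, v are
% independent and uniform over the cluster.
\[
  \mathbb{E}\bigl[\#\{(i,j) : X^u_i = X^v_j\}\bigr]
  \;=\; \sum_{i=1}^{s}\sum_{j=1}^{s} \Pr\bigl[X^u_i = X^v_j\bigr]
  \;=\; \frac{s^2}{m}.
\]
% At least one collision is expected once s^2/m >= 1, that is,
\[
  s \;\ge\; \sqrt{m} \;=\; (n/k)^{1/2}.
\]
```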
Abstract
Given a vertex in a (k, φ, ε)-clusterable graph, i.e. a graph whose vertex set can be partitioned into a disjoint union of φ-expanders of size ≈ n/k with outer conductance bounded by ε, can one quickly tell which cluster it belongs to? This question goes back to the expansion testing problem of Goldreich and Ron'11. For k = 2, a sample of ≈ n^{1/2+O(ε/φ^2)} logarithmic-length walks from a given vertex approximately determines its cluster membership by the birthday paradox: two vertices whose random walk samples are 'close' are likely in the same cluster. The study of the general case k > 2 was initiated by Czumaj, Peng and Sohler [STOC'15], and the works of Chiplunkar et al. [FOCS'18] and Gluch et al. [SODA'21] showed that ≈ poly(k) · n^{1/2+O(ε/φ^2)} random walk samples suffice for general k.…
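The collision idea described above for two clusters can be sketched in a few lines. This is a minimal illustration of the classic birthday-paradox test, not the paper's new representation: the toy graph (two cliques joined by one edge), the function names, and the walk counts are all assumptions chosen for the example.

```python
import random
from collections import Counter

def two_cluster_graph(size):
    """Toy graph (an assumption): two cliques of `size` vertices, one bridge edge."""
    adj = {v: [] for v in range(2 * size)}
    for base in (0, size):
        for a in range(base, base + size):
            for b in range(base, base + size):
                if a != b:
                    adj[a].append(b)
    adj[size - 1].append(size)
    adj[size].append(size - 1)
    return adj

def walk_endpoints(adj, start, num_walks, length, rng):
    """Count the endpoints of `num_walks` independent walks of `length` steps."""
    ends = Counter()
    for _ in range(num_walks):
        v = start
        for _ in range(length):
            v = rng.choice(adj[v])
        ends[v] += 1
    return ends

def collision_score(ends_a, ends_b):
    """Fraction of cross pairs of walks that end at the same vertex.

    This estimates the dot product of the two endpoint distributions.
    """
    pairs = sum(count * ends_b[v] for v, count in ends_a.items())
    return pairs / (sum(ends_a.values()) * sum(ends_b.values()))

rng = random.Random(0)
adj = two_cluster_graph(50)
walks, length = 200, 8

# Vertices 0 and 1 share a cluster; vertex 60 lies in the other one.
same = collision_score(walk_endpoints(adj, 0, walks, length, rng),
                       walk_endpoints(adj, 1, walks, length, rng))
diff = collision_score(walk_endpoints(adj, 0, walks, length, rng),
                       walk_endpoints(adj, 60, walks, length, rng))
print(f"same-cluster score: {same:.4f}, cross-cluster score: {diff:.4f}")
```

Walks from vertices in the same cluster mix to nearly the same distribution, so their endpoints collide at a rate close to one over the cluster size, while walks from different clusters rarely cross the bridge and almost never collide.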
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Complexity and Algorithms in Graphs · Random Matrices and Applications
