Statistical Guarantees for Local Spectral Clustering on Random Neighborhood Graphs
Alden Green, Sivaraman Balakrishnan, Ryan J. Tibshirani

TL;DR
This paper provides statistical guarantees for the Personalized PageRank algorithm in local spectral clustering, establishing conditions under which it accurately recovers density-based clusters in neighborhood graphs.
Contribution
It introduces a theoretical framework connecting PPR's performance to population-level functionals and characterizes when PPR successfully identifies high-density clusters.
Findings
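PPR, run on a neighborhood graph, recovers clusters with small population normalized cut and large population conductance and local spread.
As an application, PPR identifies connected regions of high density (density clusters) that satisfy a set of natural geometric conditions.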
Empirical results support the theoretical guarantees.
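For orientation, the graph-based functionals that the population-level quantities are said to be analogous to can be written as follows. This is a sketch using the standard definitions for a graph with adjacency matrix A and degrees d_i; the symbols are assumptions for illustration, and the paper's exact population definitions (including that of local spread) are not reproduced here.

```latex
\[
\mathrm{cut}(S) = \sum_{i \in S} \sum_{j \notin S} A_{ij},
\qquad
\mathrm{vol}(S) = \sum_{i \in S} d_i,
\qquad
\Phi(S) = \frac{\mathrm{cut}(S)}{\min\{\mathrm{vol}(S),\, \mathrm{vol}(S^{c})\}}
\]
\[
\Psi(G) = \min_{\emptyset \neq S \subsetneq V} \Phi(S)
\]
```

Here \Phi(S) is the normalized cut of a vertex set S and \Psi(G) is the conductance of the graph G. A cluster with small normalized cut is weakly connected to the rest of the graph, while large conductance of the subgraph induced by the cluster means it has no internal bottleneck.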
Abstract
We study the Personalized PageRank (PPR) algorithm, a local spectral method for clustering, which extracts clusters using locally-biased random walks around a given seed node. In contrast to previous work, we adopt a classical statistical learning setup, where we obtain samples from an unknown nonparametric distribution, and aim to identify sufficiently salient clusters. We introduce a trio of population-level functionals -- the normalized cut, conductance, and local spread, analogous to graph-based functionals of the same name -- and prove that PPR, run on a neighborhood graph, recovers clusters with small population normalized cut and large conductance and local spread. We apply our general theory to establish that PPR identifies connected regions of high density (density clusters) that satisfy a set of natural geometric conditions. We also show a converse result, that PPR can fail to…
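The pipeline described in the abstract (build a neighborhood graph on the samples, run a locally-biased random walk from a seed node, extract a cluster) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the radius-based graph, the teleportation parameter `alpha`, the fixed-point iteration, and the sweep-cut extraction rule are all choices made here, following the standard PPR formulation p = alpha * e_seed + (1 - alpha) * p W with W the lazy random walk matrix.

```python
import numpy as np

def neighborhood_graph(X, radius):
    """Adjacency matrix of a radius graph: i ~ j iff ||x_i - x_j|| <= radius (assumed construction)."""
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    A = (dists <= radius).astype(float)
    np.fill_diagonal(A, 0.0)  # no self-loops
    return A

def personalized_pagerank(A, seed, alpha=0.1, n_iter=500):
    """Iterate p <- alpha * e_seed + (1 - alpha) * p W, where W = (I + D^{-1} A) / 2."""
    n = A.shape[0]
    deg = A.sum(axis=1)
    deg_safe = np.where(deg > 0, deg, 1.0)  # guard against isolated vertices
    W = 0.5 * (np.eye(n) + A / deg_safe[:, None])
    e = np.zeros(n)
    e[seed] = 1.0
    p = e.copy()
    for _ in range(n_iter):
        p = alpha * e + (1.0 - alpha) * (p @ W)
    return p

def sweep_cut(A, p):
    """Among prefixes of vertices sorted by p / degree, return the one with smallest normalized cut."""
    deg = A.sum(axis=1)
    deg_safe = np.where(deg > 0, deg, 1.0)
    order = np.argsort(-p / deg_safe, kind="stable")
    total_vol = deg.sum()
    in_set = np.zeros(len(p), dtype=bool)
    cut, vol = 0.0, 0.0
    best_ncut, best_size = np.inf, 1
    for k, v in enumerate(order[:-1], start=1):
        # Edges from v into the current set become internal; the rest of v's edges join the cut.
        cut += deg[v] - 2.0 * A[v, in_set].sum()
        vol += deg[v]
        in_set[v] = True
        denom = min(vol, total_vol - vol)
        if denom > 0 and cut / denom < best_ncut:
            best_ncut, best_size = cut / denom, k
    return order[:best_size], best_ncut

# Toy data (hypothetical): two well-separated Gaussian blobs of 60 points each.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(60, 2)),
    rng.normal(0.0, 0.3, size=(60, 2)) + np.array([4.0, 0.0]),
])
A = neighborhood_graph(X, radius=1.0)
seed = int(np.argmin(np.linalg.norm(X[:60], axis=1)))  # point nearest the first blob's center
p = personalized_pagerank(A, seed)
cluster, ncut = sweep_cut(A, p)
```

On this toy data the two blobs are far enough apart that the radius graph has no edges between them, so the estimated cluster stays inside the seed's blob. The paper's own guarantees concern which density clusters are recovered, and under what geometric conditions, when the graph is built from samples of an unknown nonparametric distribution.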
Taxonomy
Topics: Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research · Complex Network Analysis Techniques
