Spectral Clustering Oracles in Sublinear Time

Grzegorz Gluch; Michael Kapralov; Silvio Lattanzi; Aida Mousavifar; Christian Sohler

arXiv:2101.05549·cs.DS·October 20, 2021

Open Access

TL;DR

This paper introduces a sublinear-time spectral clustering oracle for graphs that can be partitioned into expanders. The oracle classifies vertices quickly and with low misclassification error per cluster by approximating spectral embeddings with random walks.

Contribution

It develops a spectral clustering oracle whose query time and preprocessing time are both sublinear in $n$, using estimates of random walk distributions to approximate spectral embeddings with high precision.
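The general idea of comparing vertices through random walk distributions can be illustrated with a toy sketch. This is not the paper's algorithm: the graph (`two_cliques`), the function names, and the parameters `steps` and `walks` are all hypothetical choices made for illustration. The sketch estimates the dot product of the endpoint distributions of lazy random walks started at two vertices; vertices in the same well-connected cluster tend to produce a larger value than vertices in different clusters.

```python
import random
from collections import Counter


def two_cliques(size):
    """Toy graph (an assumption for this sketch): two cliques joined by one edge."""
    adj = {v: [] for v in range(2 * size)}
    for block in (range(size), range(size, 2 * size)):
        for u in block:
            for v in block:
                if u != v:
                    adj[u].append(v)
    # Single bridge edge between the two cliques.
    adj[size - 1].append(size)
    adj[size].append(size - 1)
    return adj


def walk_endpoint_counts(adj, start, steps, walks, rng):
    """Run `walks` independent lazy random walks of `steps` steps; count endpoints."""
    counts = Counter()
    for _ in range(walks):
        v = start
        for _ in range(steps):
            if rng.random() < 0.5:  # lazy step: stay put with probability 1/2
                continue
            v = rng.choice(adj[v])
        counts[v] += 1
    return counts


def estimated_dot_product(adj, x, y, steps, walks, rng):
    """Estimate <p_x, p_y>, where p_v is the endpoint distribution of a walk from v."""
    cx = walk_endpoint_counts(adj, x, steps, walks, rng)
    cy = walk_endpoint_counts(adj, y, steps, walks, rng)
    # Empirical distributions are counts / walks, so the dot product is
    # the sum of count products divided by walks squared.
    return sum(cx[v] * cy[v] for v in cx) / (walks * walks)


if __name__ == "__main__":
    rng = random.Random(0)
    adj = two_cliques(8)
    same = estimated_dot_product(adj, 0, 1, steps=6, walks=2000, rng=rng)
    cross = estimated_dot_product(adj, 0, 9, steps=6, walks=2000, rng=rng)
    print("same cluster:", same)
    print("different clusters:", cross)
```

In this toy setting the number of walks is chosen for readability rather than efficiency; the sublinear bounds quoted on this page come from the paper's analysis, not from this sketch.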

Findings

01

Query time is $O^*(n^{1/2+O(\epsilon)})$ for the oracle.

02

Preprocessing time is $2^{O(\frac{1}{\epsilon} k^4 \log^2(k))} n^{1/2+O(\epsilon)}$.

03

Achieves $O(\epsilon \log k)$ misclassification error per cluster.

Abstract

Given a graph $G$ that can be partitioned into $k$ disjoint expanders with outer conductance upper bounded by $\epsilon \ll 1$, can we efficiently construct a small space data structure that allows quickly classifying vertices of $G$ according to the expander (cluster) they belong to? Formally, we would like an efficient local computation algorithm that misclassifies at most an $O(\epsilon)$ fraction of vertices in every expander. We refer to such a data structure as a spectral clustering oracle. Our main result is a spectral clustering oracle with query time $O^{*}(n^{1/2 + O(\epsilon)})$ and preprocessing time $2^{O(\frac{1}{\epsilon} k^{4} \log^{2}(k))} n^{1/2 + O(\epsilon)}$ that provides misclassification error $O(\epsilon \log k)$ per cluster for any $\epsilon \ll 1/\log k$. More generally, query time can be reduced at the expense of increasing the preprocessing time appropriately…
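The two quantities the abstract bounds can be made concrete with a small sketch. It assumes the common definition of outer conductance, the number of edges leaving a cluster divided by the total degree of the cluster; the abstract itself does not spell out the definition. The toy graph and the helper names `outer_conductance` and `misclassified_fraction` are hypothetical.

```python
def outer_conductance(adj, cluster):
    """Edges leaving `cluster` divided by the sum of degrees inside it (assumed definition)."""
    cluster = set(cluster)
    volume = sum(len(adj[v]) for v in cluster)
    crossing = sum(1 for u in cluster for v in adj[u] if v not in cluster)
    return crossing / volume


def misclassified_fraction(cluster, predicted_label, label):
    """Fraction of vertices in `cluster` that the classifier does not assign to `label`."""
    wrong = sum(1 for v in cluster if predicted_label(v) != label)
    return wrong / len(cluster)


# Toy graph: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
adj = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4, 5],
    4: [3, 5],
    5: [3, 4],
}

# One crossing edge over a total degree of 2 + 2 + 3 = 7.
phi = outer_conductance(adj, [0, 1, 2])

# A hypothetical classifier that wrongly assigns vertex 2 to the second cluster.
err = misclassified_fraction([0, 1, 2], lambda v: 0 if v < 2 else 1, label=0)
```

Here `phi` equals 1/7 and `err` equals 1/3. In the abstract's terms, a clustering oracle is judged by how small the second quantity is in every cluster, given that the first is at most $\epsilon$.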

Taxonomy

Topics: Complex Network Analysis Techniques · Advanced Clustering Algorithms Research · Sparse and Compressive Sensing Techniques