Scaling Graph Clustering with Distributed Sketches
Benjamin W. Priest, Alec Dunton, Geoffrey Sanders

TL;DR
This paper introduces a scalable distributed graph clustering method using random projection-based matrix sketches, significantly reducing communication costs and improving performance on dynamic graph streams.
Contribution
The paper proposes a novel spectral clustering approach using matrix sketches from random projections, enabling efficient distributed clustering on large, dynamic graphs.
Findings
Embeddings from random projections produce effective clustering results.
The method reduces communication rounds compared to traditional spectral clustering.
Clustering quality depends on choosing an embedding dimensionality appropriate to the model parameters.
Abstract
The unsupervised learning of community structure, in particular the partitioning of vertices into clusters or communities, is a canonical and well-studied problem in exploratory graph analysis. However, as with most graph analyses, the introduction of immense scale presents challenges to traditional methods. Spectral clustering in distributed memory, for example, requires hundreds of expensive bulk-synchronous communication rounds to compute an embedding of vertices to a few eigenvectors of a graph-associated matrix. Furthermore, the whole computation may need to be repeated if the underlying graph changes by even a small percentage of edge updates. We present a method inspired by spectral clustering where we instead use matrix sketches derived from random dimension-reducing projections. We show that our method produces embeddings that yield performant clustering results given a fully-dynamic…
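The idea of replacing an eigenvector embedding with a random-projection sketch can be illustrated with a minimal sketch in Python. Everything here is an assumption made for illustration and is not taken from the paper: the choice of a dense Gaussian projection, the use of the plain adjacency matrix as the "graph-associated matrix", and the function names `make_projection`, `apply_edge_update`, and `sketch_from_stream`. The point it shows is that a linear sketch `A @ P` can be maintained one edge update at a time, so insertions and deletions need no recomputation, and partial sketches built on separate workers can simply be added.

```python
import numpy as np


def make_projection(n_vertices, dim, seed=0):
    """Hypothetical helper: dense Gaussian projection, scaled by 1/sqrt(dim)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_vertices, dim)) / np.sqrt(dim)


def apply_edge_update(sketch, proj, u, v, weight=1.0):
    """Apply one undirected edge update to the sketch S = A @ P.

    Adding `weight` to A[u, v] and A[v, u] changes only rows u and v of S.
    A negative weight models an edge deletion.
    """
    sketch[u] += weight * proj[v]
    if u != v:
        sketch[v] += weight * proj[u]


def sketch_from_stream(n_vertices, dim, updates, seed=0):
    """Build the sketch from a stream of (u, v, weight) updates."""
    proj = make_projection(n_vertices, dim, seed)
    sketch = np.zeros((n_vertices, dim))
    for u, v, w in updates:
        apply_edge_update(sketch, proj, u, v, w)
    return sketch, proj


if __name__ == "__main__":
    # Two triangles, plus a bridge edge (2, 3) that is inserted and then deleted.
    stream = [
        (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
        (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
        (2, 3, 1.0), (2, 3, -1.0),
    ]
    sketch, _ = sketch_from_stream(6, 4, stream)
    # Each row of `sketch` is a 4-dimensional embedding of one vertex.
    print(sketch.shape)
```

Each row of the resulting sketch is a low-dimensional vertex embedding that could then be passed to a standard clustering routine such as k-means. Because the sketch is linear in the edge updates, workers holding disjoint parts of the stream and sharing the same projection seed can each build a partial sketch and combine them with a single sum.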
Taxonomy
Methods: Spectral Clustering
