Scalable Spectral Clustering with Nystrom Approximation: Practical and Theoretical Aspects

Farhad Pourkamali-Anaraki

arXiv:2006.14470·cs.LG·February 2, 2021

TL;DR

This paper introduces a new spectral clustering algorithm that leverages Nystrom approximation more effectively, balancing accuracy and efficiency, and demonstrates its advantages through theoretical analysis and experiments.

Contribution

It presents a principled Nystrom-based spectral clustering method that improves accuracy-efficiency trade-offs and addresses limitations of previous approaches.

Findings

01 Outperforms existing Nystrom-based spectral clustering methods.

02 Provides theoretical bounds on approximation quality.

03 Demonstrates efficiency on real and synthetic datasets.

Abstract

Spectral clustering techniques are valuable tools in signal processing and machine learning for partitioning complex data sets. The effectiveness of spectral clustering stems from constructing a non-linear embedding based on creating a similarity graph and computing the spectral decomposition of the Laplacian matrix. However, spectral clustering methods fail to scale to large data sets because of high computational cost and memory usage. A popular approach for addressing these problems utilizes the Nystrom method, an efficient sampling-based algorithm for computing low-rank approximations to large positive semi-definite matrices. This paper demonstrates how the previously popular approach of Nystrom-based spectral clustering has severe limitations. Existing time-efficient methods ignore critical information by prematurely reducing the rank of the similarity matrix associated with…
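
To make the ingredients named in the abstract concrete, the following is a minimal sketch of the standard Nystrom approximation feeding a spectral embedding. It is not the algorithm proposed in the paper. The Gaussian (RBF) similarity, the uniform landmark sampling, and the names `rbf_kernel`, `nystrom_factor`, `spectral_embedding`, `gamma`, and `m` are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    # Gaussian similarity exp(-gamma * ||x - y||^2) between rows of X and Y.
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_factor(X, m, gamma, rng):
    # Return G with K ~ G @ G.T, built from m uniformly sampled landmarks
    # (assumed sampling scheme). Only an n x m block of K is ever formed.
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)
    C = rbf_kernel(X, X[idx], gamma)      # n x m: similarities to landmarks
    W = C[idx]                            # m x m: landmark-landmark block
    W = 0.5 * (W + W.T)                   # remove floating-point asymmetry
    evals, evecs = np.linalg.eigh(W)
    keep = evals > 1e-10 * evals.max()    # pseudo-inverse: drop tiny eigenvalues
    # K ~ C W^+ C^T = G G^T with G = C U diag(evals)^(-1/2)
    return C @ (evecs[:, keep] / np.sqrt(evals[keep]))

def spectral_embedding(G, k):
    # Approximate degrees d = K_hat @ 1 without forming the n x n matrix.
    d = G @ (G.T @ np.ones(G.shape[0]))
    d = np.maximum(d, 1e-12)              # guard: approximate degrees may be <= 0
    M = G / np.sqrt(d)[:, None]           # M @ M.T ~ D^(-1/2) K D^(-1/2)
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :k]                       # top-k eigenvectors of M @ M.T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 0.3, (100, 2)), rng.normal(3, 0.3, (100, 2))])
    G = nystrom_factor(X, m=20, gamma=1.0, rng=rng)
    E = spectral_embedding(G, k=2)
    print(E.shape)
```

The rows of the returned embedding would then typically be passed to k-means to obtain cluster labels. Working with the factor `G` keeps memory at roughly n times m entries instead of n squared, which is the scalability argument the abstract refers to.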

Taxonomy

Methods: Spectral Clustering