Mini-Batch Spectral Clustering
Yufei Han, Maurizio Filippone

TL;DR
This paper introduces a scalable spectral clustering method based on adaptive stochastic gradient optimization. It recovers the exact Laplacian spectrum in the limit of the iterations and outperforms existing approximate methods on large data sets for a given computational budget.
Contribution
The paper presents a stochastic gradient approach to spectral clustering that recovers the exact Laplacian spectrum in the limit of the iterations, with a per-iteration cost that is linear in the number of samples.
Findings
Scales to data sets with up to half a million samples
Outperforms state-of-the-art approximate methods for a given computational budget
Recovers exact Laplacian spectrum asymptotically
Abstract
The cost of computing the spectrum of Laplacian matrices hinders the application of spectral clustering to large data sets. While approximations recover computational tractability, they can potentially affect clustering performance. This paper proposes a practical approach to learn spectral clustering based on adaptive stochastic gradient optimization. Crucially, the proposed approach recovers the exact spectrum of Laplacian matrices in the limit of the iterations, and the cost of each iteration is linear in the number of samples. Extensive experimental validation on data sets with up to half a million samples demonstrates its scalability and its ability to outperform state-of-the-art approximate methods to learn spectral clustering for a given computational budget.
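The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' algorithm. It assumes an RBF affinity, a normalized affinity matrix S = D^{-1/2} W D^{-1/2} (whose top eigenvectors are the bottom eigenvectors of the normalized Laplacian I - S), a plain decaying step size in place of the paper's adaptive scheme, and a QR step to keep the iterate orthonormal. All function names and parameter values are hypothetical. Each iteration touches only a mini-batch of columns of S, so its cost is linear in the number of samples; S is built densely here only to keep the demo short.

```python
import numpy as np

def normalized_affinity(X, gamma=1.0):
    """Dense S = D^{-1/2} W D^{-1/2} with an RBF kernel (demo only, O(n^2))."""
    sq = np.sum(X**2, axis=1)
    W = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))
    d = W.sum(axis=1)
    return W / np.sqrt(np.outer(d, d))

def minibatch_top_subspace(S, k, batch=50, iters=300, eta0=1.0, seed=0):
    """Stochastic ascent on trace(V^T S V) subject to V^T V = I.

    Assumption: decaying step eta0 / sqrt(t + 1), not the paper's adaptive rule.
    """
    rng = np.random.default_rng(seed)
    n = S.shape[0]
    V, _ = np.linalg.qr(rng.standard_normal((n, k)))
    for t in range(iters):
        idx = rng.choice(n, size=batch, replace=False)
        # Unbiased estimate of S @ V from `batch` columns: O(n * batch * k).
        G = (n / batch) * S[:, idx] @ V[idx, :]
        # Gradient step followed by QR to restore orthonormal columns.
        V, _ = np.linalg.qr(V + eta0 / np.sqrt(t + 1.0) * G)
    return V

# Toy data: two well-separated Gaussian blobs of 100 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (100, 2)), rng.normal(6.0, 0.5, (100, 2))])
truth = np.repeat([1, 0], 100)

S = normalized_affinity(X)
V = minibatch_top_subspace(S, k=2)

# Compare with the exact top-2 eigenvectors (eigh sorts eigenvalues ascending).
U = np.linalg.eigh(S)[1][:, -2:]
alignment = np.linalg.norm(U.T @ V) ** 2  # equals k when the subspaces coincide

# Demo shortcut instead of k-means: rows of V from different clusters are
# nearly orthogonal, so compare every normalized row with the first one.
R = V / np.linalg.norm(V, axis=1, keepdims=True)
labels = (R @ R[0] > 0.5).astype(int)
accuracy = np.mean(labels == truth)
print(f"subspace alignment: {alignment:.3f} (max 2), accuracy: {accuracy:.2f}")
```

In a large-scale setting the columns `S[:, idx]` would be computed on demand from the data rather than read from a stored matrix, and the row-comparison shortcut would be replaced by k-means on the rows of `V`.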
Taxonomy
Topics: Face and Expression Recognition · Advanced Clustering Algorithms Research · Sparse and Compressive Sensing Techniques
Methods: Spectral Clustering
