A Randomized Approach to Efficient Kernel Clustering
Farhad Pourkamali-Anaraki, Stephen Becker

TL;DR
This paper introduces a randomized kernel approximation method for efficient kernel clustering that significantly reduces memory usage while maintaining accuracy, making kernel-based clustering more scalable for large datasets.
Contribution
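The paper provides a new analysis of a class of approximate kernel methods and proposes a specific one-pass randomized kernel approximation, followed by standard K-means on the transformed data, that requires less memory than standard kernel K-means and Nyström-based approximations.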
Findings
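The analysis and experiments suggest the proposed method is accurate in clustering tasks.
It requires drastically less memory than standard kernel K-means, whose memory scales as the square of the number of data points.
It requires significantly less memory than Nyström-based approximations.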
Abstract
Kernel-based K-means clustering has gained popularity due to its simplicity and the power of its implicit non-linear representation of the data. A dominant concern is the memory requirement, since memory scales as the square of the number of data points. We provide a new analysis of a class of approximate kernel methods that have more modest memory requirements, and propose a specific one-pass randomized kernel approximation followed by standard K-means on the transformed data. The analysis and experiments suggest the method is accurate, while requiring drastically less memory than standard kernel K-means and significantly less memory than Nyström-based approximations.
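To make the pipeline in the abstract concrete, the following is a minimal Python sketch of a one-pass randomized low-rank kernel approximation followed by K-means on the resulting features. It is an illustration under stated assumptions, not the authors' implementation: the Gaussian kernel, the function names (`rbf_kernel_rows`, `one_pass_kernel_features`, `lloyd_kmeans`), and the parameters `gamma`, `oversample`, and `block` are all hypothetical choices, and the single-pass estimator shown is the generic randomized range-finder scheme, which may differ from the estimator analyzed in the paper. The kernel matrix K is never stored in full: only a block of rows and an n-by-(rank + oversample) sketch are held in memory at any time, instead of the n-by-n entries needed by standard kernel K-means.

```python
import numpy as np


def rbf_kernel_rows(X_rows, X_all, gamma):
    """Gaussian kernel values between a block of points and all points."""
    sq_dist = (
        np.sum(X_rows**2, axis=1)[:, None]
        + np.sum(X_all**2, axis=1)[None, :]
        - 2.0 * X_rows @ X_all.T
    )
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def one_pass_kernel_features(X, rank, gamma=1.0, oversample=10, block=256, seed=0):
    """Return F (n x rank) with K ~= F @ F.T, computing each kernel entry once."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    ell = min(n, rank + oversample)  # sketch width (assumed oversampling rule)
    omega = rng.standard_normal((n, ell))

    # Single pass: form Y = K @ omega block by block, never storing all of K.
    Y = np.empty((n, ell))
    for start in range(0, n, block):
        stop = min(start + block, n)
        Y[start:stop] = rbf_kernel_rows(X[start:stop], X, gamma) @ omega

    Q, _ = np.linalg.qr(Y)  # orthonormal basis for the sketched range

    # Estimate B ~= Q.T @ K @ Q without revisiting K, by solving
    # B @ (Q.T @ omega) = Q.T @ Y in the least-squares sense.
    B_t, *_ = np.linalg.lstsq((Q.T @ omega).T, (Q.T @ Y).T, rcond=None)
    B = 0.5 * (B_t + B_t.T)  # enforce symmetry

    evals, evecs = np.linalg.eigh(B)
    top = np.argsort(evals)[::-1][:rank]
    lam = np.clip(evals[top], 0.0, None)  # discard small negative estimates
    return (Q @ evecs[:, top]) * np.sqrt(lam)


def lloyd_kmeans(F, k, iters=50):
    """Plain Lloyd iterations with farthest-first initialization."""
    centers = [F[0]]
    for _ in range(1, k):
        d = np.min([np.sum((F - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(F[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dists = ((F[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = F[labels == j].mean(axis=0)
    return labels
```

A typical call would be `F = one_pass_kernel_features(X, rank=2, gamma=0.5)` followed by `labels = lloyd_kmeans(F, k=2)`. The small Lloyd routine is only a dependency-free stand-in for "standard K-means"; any off-the-shelf K-means implementation could be run on the rows of `F` instead.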
Taxonomy
Methods: k-Means Clustering
