Popcorn: Accelerating Kernel K-means on GPUs through Sparse Linear Algebra
Julian Bellavita, Thomas Pasquali, Laura Del Rio Martin, Flavio Vella, Giulia Guidi

TL;DR
Popcorn is a GPU-accelerated, open-source implementation of Kernel K-means that uses sparse linear algebra to significantly speed up clustering of large datasets, enabling practical non-linear clustering.
Contribution
This paper introduces a novel formulation of Kernel K-means using sparse linear algebra, enabling efficient GPU implementation with substantial speedups.
Findings
Achieves up to 123.8x speedup over CPU implementation
Outperforms non-sparse GPU Kernel K-means by up to 2.6x
First open-source GPU Kernel K-means implementation
Abstract
K-means is a popular clustering algorithm with significant applications in numerous scientific and engineering areas. One drawback of K-means is its inability to identify non-linearly separable clusters, which may lead to inaccurate solutions in certain cases. Kernel K-means is a variant of classical K-means that can find non-linearly separable clusters. However, it scales quadratically with respect to the size of the dataset, taking several minutes to cluster even medium-sized datasets on traditional CPU-based machines. In this paper, we present a formulation of Kernel K-means using sparse-dense matrix multiplication (SpMM) and sparse matrix-vector multiplication (SpMV), and we show that our formulation enables the rapid implementation of a fast GPU-based version of Kernel K-means with little programming effort. Our implementation, named Popcorn, is the first open-source GPU-based…
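To make the SpMM/SpMV idea concrete, here is a minimal CPU sketch in Python using NumPy and SciPy. It is not Popcorn's code and may differ from the paper's exact formulation. It assumes the standard kernel K-means identity, where the squared feature-space distance from point i to the centroid of cluster j is K[i, i] - 2 (K V^T)[i, j] + (V K V^T)[j, j], K is the n x n kernel matrix, and V is a sparse k x n matrix with a single nonzero 1/|C_j| in each column. The function names (`selection_matrix`, `kernel_distances`, `kernel_kmeans`) are hypothetical and chosen only for this illustration.

```python
import numpy as np
from scipy import sparse


def selection_matrix(labels, k):
    """Sparse k x n matrix V with V[j, l] = 1/|C_j| if point l is in cluster j."""
    n = labels.shape[0]
    counts = np.bincount(labels, minlength=k).astype(float)
    # counts[labels] is always >= 1, because each label counts itself.
    vals = 1.0 / counts[labels]
    return sparse.csr_matrix((vals, (labels, np.arange(n))), shape=(k, n))


def kernel_distances(K, labels, k):
    """n x k matrix of squared feature-space distances to the cluster centroids."""
    V = selection_matrix(labels, k)
    # Sparse-dense product (SpMM): E[i, j] is the mean of K[i, l] over l in C_j.
    # K is symmetric, so (V K)^T equals K V^T.
    E = (V @ K).T
    # Centroid squared norms (V K V^T)[j, j], using only the nonzeros of V.
    c = np.asarray(V.multiply(E.T).sum(axis=1)).ravel()
    # An empty cluster gets E = 0 and c = 0, i.e. a centroid at the origin.
    return np.diag(K)[:, None] - 2.0 * E + c[None, :]


def kernel_kmeans(K, k, n_iter=50, seed=0):
    """Lloyd-style kernel K-means on a precomputed symmetric kernel matrix K."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=K.shape[0])  # random initial assignment
    for _ in range(n_iter):
        new_labels = np.argmin(kernel_distances(K, labels, k), axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
    return labels
```

Because V has exactly one nonzero per column, the product with K touches each kernel entry once per iteration, which is where the quadratic cost in the dataset size comes from. With a linear kernel (K = X X^T) the distances reduce to ordinary squared Euclidean distances to the cluster means, which gives a simple way to sanity-check the sketch.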
