A High Performance Implementation of Spectral Clustering on CPU-GPU Platforms
Yu Jin, Joseph F. JaJa

TL;DR
This paper introduces a high-performance spectral clustering implementation for CPU-GPU platforms that runs significantly faster than existing Matlab and Python implementations, making spectral clustering practical for large, high-dimensional datasets.
Contribution
The paper presents a parallel implementation of spectral clustering that splits the work across a multi-core CPU and a GPU, improving speed and scalability over the existing Matlab and Python implementations.
Findings
Significantly faster than Matlab and Python implementations.
Scales efficiently to large numbers of clusters.
Effective utilization of CPU and GPU resources for spectral clustering.
Abstract
Spectral clustering is one of the most popular graph clustering algorithms, which achieves the best performance for many scientific and engineering applications. However, existing implementations in commonly used software platforms such as Matlab and Python do not scale well for many of the emerging Big Data applications. In this paper, we present a fast implementation of the spectral clustering algorithm on a CPU-GPU heterogeneous platform. Our implementation takes advantage of the computational power of the multi-core CPU and the massive multithreading and SIMD capabilities of GPUs. Given the input as data points in high-dimensional space, we propose a parallel scheme to build a sparse similarity graph represented in a standard sparse representation format. Then we compute the smallest eigenvectors of the Laplacian matrix by utilizing the reverse communication interfaces of ARPACK…
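The pipeline the abstract outlines (sparse similarity graph, then the smallest eigenvectors of the Laplacian) can be illustrated with a small CPU-only sketch in Python. This is not the authors' CPU-GPU implementation. It assumes a k-nearest-neighbor graph with Gaussian edge weights, the symmetric normalized Laplacian, and a final k-means step on the eigenvector rows, none of which the visible part of the abstract confirms. The function names and the parameters `n_neighbors` and `sigma` are hypothetical. SciPy's `eigsh` wraps ARPACK but hides the reverse communication loop that the paper drives directly.

```python
import numpy as np
from scipy.cluster.vq import kmeans2
from scipy.sparse import csr_matrix, diags
from scipy.sparse.linalg import eigsh
from scipy.spatial import cKDTree


def knn_similarity_graph(points, n_neighbors=10, sigma=1.0):
    """Build a symmetric sparse kNN graph in CSR format (assumed graph type)."""
    n = points.shape[0]
    # Query one extra neighbor because each point is its own nearest neighbor
    # (this assumes the input has no duplicate points).
    dist, idx = cKDTree(points).query(points, k=n_neighbors + 1)
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = idx[:, 1:].ravel()
    vals = np.exp(-dist[:, 1:].ravel() ** 2 / (2.0 * sigma ** 2))
    W = csr_matrix((vals, (rows, cols)), shape=(n, n))
    return W.maximum(W.T)  # keep an edge if either endpoint selected it


def spectral_embedding(W, n_clusters):
    """Rows of the eigenvectors for the smallest normalized-Laplacian eigenvalues."""
    degree = np.asarray(W.sum(axis=1)).ravel()
    d_inv_sqrt = diags(1.0 / np.sqrt(degree))
    S = d_inv_sqrt @ W @ d_inv_sqrt
    # The smallest eigenvectors of L = I - S are the largest eigenvectors of S,
    # which ARPACK finds without needing shift-invert.
    _, vecs = eigsh(S, k=n_clusters, which="LA")
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / np.maximum(norms, 1e-12)


def farthest_point_init(X, k):
    """Deterministic k-means seeding: repeatedly pick the farthest row."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    return np.array(centers)


def spectral_clustering(points, n_clusters, n_neighbors=10, sigma=1.0):
    W = knn_similarity_graph(points, n_neighbors, sigma)
    embedding = spectral_embedding(W, n_clusters)
    init = farthest_point_init(embedding, n_clusters)
    _, labels = kmeans2(embedding, init, minit="matrix")
    return labels
```

Calling `spectral_clustering(points, n_clusters=2)` on an `(n, d)` NumPy array returns one integer label per row. The sketch works best when the graph is connected. If `n_neighbors` is too small the graph can split into disconnected pieces, which makes the leading eigenvalue degenerate and the eigensolver less reliable.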
