clusterNOR: A NUMA-Optimized Clustering Framework
Disa Mhembere, Da Zheng, Carey E. Priebe, Joshua T. Vogelstein, Randal Burns

TL;DR
clusterNOR is a flexible, NUMA-optimized clustering framework that significantly accelerates various clustering algorithms by reducing memory bottlenecks and barriers, enabling scalable high-performance clustering on large datasets.
Contribution
It introduces a generic, extensible clustering framework optimized for NUMA architectures, with novel algorithms and optimizations that outperform existing solutions.
Findings
Order-of-magnitude speedup over state-of-the-art solutions
Supports in-memory, semi-external, and distributed execution
Efficient pruning algorithm for billion-point datasets
Abstract
Clustering algorithms are iterative and have complex data access patterns that result in many small random memory accesses. The performance of parallel implementations suffers from synchronous barriers for each iteration and skewed workloads. We rethink the parallelization of clustering for modern non-uniform memory architectures (NUMA) to maximize independent, asynchronous computation. We eliminate many barriers, reduce remote memory accesses, and maximize cache reuse. We implement the 'Clustering NUMA Optimized Routines' (clusterNOR) extensible parallel framework that provides algorithmic building blocks. The system is generic: we demonstrate nine modern clustering algorithms that have simple implementations. clusterNOR includes (i) in-memory, (ii) semi-external memory, and (iii) distributed memory execution, enabling computation for varying memory and hardware budgets. For algorithms…
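The idea of replacing many per-iteration barriers with independent local work can be illustrated with a minimal sketch. It is not clusterNOR's API or the authors' implementation: the function names (nearest, local_pass, kmeans_step) and the toy data are hypothetical, k-means is used only as a familiar example, and Python threads provide neither NUMA placement nor true parallelism. Each partition stands in for data local to one NUMA node; a worker accumulates private per-cluster sums and counts, and the only synchronization is a single merge at the end of the iteration.

```python
from concurrent.futures import ThreadPoolExecutor


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    best, best_d = 0, float("inf")
    for idx, c in enumerate(centroids):
        d = sum((p - q) ** 2 for p, q in zip(point, c))
        if d < best_d:
            best, best_d = idx, d
    return best


def local_pass(partition, centroids):
    """One worker's pass over its own partition; writes only private state."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for point in partition:
        j = nearest(point, centroids)
        counts[j] += 1
        for a in range(dim):
            sums[j][a] += point[a]
    return sums, counts


def kmeans_step(partitions, centroids):
    """One iteration: independent local passes, then a single merge."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        results = list(
            pool.map(lambda part: local_pass(part, centroids), partitions)
        )
    k, dim = len(centroids), len(centroids[0])
    new_centroids = []
    for j in range(k):
        total = sum(counts[j] for _, counts in results)
        if total == 0:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(list(centroids[j]))
            continue
        new_centroids.append(
            [sum(sums[j][a] for sums, _ in results) / total for a in range(dim)]
        )
    return new_centroids


# Hypothetical toy data: two partitions, two clusters, two dimensions.
partitions = [
    [(0.0, 0.0), (0.0, 2.0)],
    [(10.0, 10.0), (10.0, 12.0)],
]
centroids = [(1.0, 1.0), (9.0, 9.0)]
updated = kmeans_step(partitions, centroids)
print(updated)
```

Because each worker reads only its own partition and the shared, read-only centroids, no locking is needed during the pass. A NUMA-aware system would additionally have to pin threads and allocate each partition on the matching memory node, which this sketch does not attempt.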
Taxonomy
Topics: Advanced Clustering Algorithms Research · Data Management and Algorithms · Caching and Content Delivery
