Faster Parallel Exact Density Peaks Clustering

Yihao Huang; Shangdi Yu; Julian Shun

arXiv:2305.11335 · cs.DC · May 22, 2023 · 1 citation

Open Access

TL;DR

This paper introduces fast, parallel algorithms for exact Density Peaks Clustering, significantly improving scalability and speed over previous methods, enabling large-scale data analysis in various fields.

Contribution

The paper presents novel parallel algorithms for exact DPC that have optimal work and low span, and that outperform existing methods in speed and scalability.

Findings

01

Achieves $O(\log n \log \log n)$ span with priority search kd-trees.

02

Realizes up to 13169x speedup over previous parallel algorithms.

03

Maintains work-efficiency matching the best sequential algorithms.

Abstract

Clustering multidimensional points is a fundamental data mining task, with applications in many fields, such as astronomy, neuroscience, bioinformatics, and computer vision. The goal of clustering algorithms is to group similar objects together. Density-based clustering is a clustering approach that defines clusters as dense regions of points. It has the advantage of being able to detect clusters of arbitrary shapes, rendering it useful in many applications. In this paper, we propose fast parallel algorithms for Density Peaks Clustering (DPC), a popular version of density-based clustering. Existing exact DPC algorithms suffer from low parallelism both in theory and in practice, which limits their application to large-scale data sets. Our most performant algorithm, which is based on priority search kd-trees, achieves $O(\log n \log \log n)$ span (parallel time complexity) for a data set…
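
For orientation, the following is a minimal brute-force sketch of exact DPC as it is commonly defined: each point gets a density (the number of points within a cutoff radius), is linked to its nearest neighbor of higher density (its dependent point), and cluster centers are dense points that lie far from any denser point. This is a quadratic-time baseline meant only to illustrate what "exact DPC" computes; it is not the parallel priority search kd-tree algorithm the abstract refers to. The function name `dpc_bruteforce`, the parameter names `d_cut`, `rho_min`, and `delta_min`, counting a point in its own density, and breaking density ties by index are all assumptions made for this illustration.

```python
import math


def dpc_bruteforce(points, d_cut, rho_min, delta_min):
    """Brute-force exact DPC sketch (O(n^2) distance evaluations).

    Hypothetical interface, not taken from the paper:
      points    -- list of coordinate tuples
      d_cut     -- density cutoff radius
      rho_min   -- points with lower density are treated as noise
      delta_min -- minimum dependent distance for a cluster center
    Returns (rho, dep, labels); noise points get label -1.
    """
    n = len(points)

    def dist(a, b):
        return math.dist(points[a], points[b])

    # Density: number of points within d_cut (the point itself is counted).
    rho = [sum(1 for j in range(n) if dist(i, j) <= d_cut) for i in range(n)]

    # j outranks i if it is denser; ties are broken by smaller index (assumption).
    def outranks(j, i):
        return (rho[j], -j) > (rho[i], -i)

    # Dependent point: the nearest point that outranks i.
    dep = [-1] * n
    delta = [math.inf] * n  # stays infinite for the top-ranked point
    for i in range(n):
        for j in range(n):
            if outranks(j, i):
                d = dist(i, j)
                if d < delta[i]:
                    delta[i] = d
                    dep[i] = j

    # Visit points from highest to lowest rank so that every dependent
    # point has already been labeled when it is needed.
    labels = [-1] * n
    next_label = 0
    for i in sorted(range(n), key=lambda k: (-rho[k], k)):
        if rho[i] < rho_min:
            continue  # noise point
        if delta[i] >= delta_min:
            labels[i] = next_label  # new cluster center
            next_label += 1
        else:
            labels[i] = labels[dep[i]]  # inherit the dependent point's cluster
    return rho, dep, labels
```

Calling the function on two well-separated groups of four 2D points each, with `d_cut=1.5`, `rho_min=2`, and `delta_min=3.0`, places each group in its own cluster. The nested loops make the cost of this baseline explicit, which is the kind of sequential bottleneck that motivates tree-based and parallel approaches.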

Taxonomy

Topics: Advanced Clustering Algorithms Research · Data Management and Algorithms · Face and Expression Recognition