Persistent Multiscale Density-based Clustering

Daniël Bot; Leland McInnes; Jan Aerts

arXiv:2512.16558 · cs.LG · February 3, 2026

Open Access

TL;DR

This paper introduces PLSCAN, a new density-based clustering algorithm that identifies stable clusters across scales, reducing hyperparameter sensitivity and improving robustness in exploratory data analysis.

Contribution

The paper presents PLSCAN, a novel scale-space clustering method based on persistent homology, which efficiently finds stable clusters without extensive hyperparameter tuning.

Findings

01. PLSCAN achieves higher clustering accuracy than HDBSCAN*, as measured by the adjusted Rand index (ARI).

02. PLSCAN is less sensitive to the number of neighbors.

03. PLSCAN has competitive computational costs, especially in low dimensions.
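
For readers unfamiliar with the ARI used in the first finding, the following is a minimal sketch of the standard adjusted Rand index, computed from pair counts. It is not taken from the paper; the function name `adjusted_rand_index` and the toy labels are illustrative assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Standard adjusted Rand index between two labelings (illustrative helper)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two points: treat as perfect agreement

    # Count point pairs that share a cell of the contingency table,
    # a ground-truth cluster, or a predicted cluster.
    cell_pairs = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    true_pairs = sum(comb(c, 2) for c in Counter(labels_true).values())
    pred_pairs = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = true_pairs * pred_pairs / total_pairs  # agreement expected by chance
    maximum = (true_pairs + pred_pairs) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings put all points in one cluster
    return (cell_pairs - expected) / (maximum - expected)


# Toy example: the score ignores how clusters are named.
same_partition = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed_partition = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

A score of 1 means the two partitions are identical up to relabeling, values near 0 indicate chance-level agreement, and negative values indicate less agreement than chance.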

Abstract

Clustering is a cornerstone of modern data analysis. Detecting clusters in exploratory data analyses (EDA) requires algorithms that make few assumptions about the data. Density-based clustering algorithms are particularly well-suited for EDA because they describe high-density regions, assuming only that a density exists. Applying density-based clustering algorithms in practice, however, requires selecting appropriate hyperparameters, which is difficult without prior knowledge of the data distribution. For example, DBSCAN requires selecting a density threshold, and HDBSCAN* relies on a minimum cluster size parameter. In this work, we propose Persistent Leaves Spatial Clustering for Applications with Noise (PLSCAN). This novel density-based clustering algorithm efficiently identifies all minimum cluster sizes for which HDBSCAN* produces stable (leaf) clusters. PLSCAN applies scale-space…
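
The hyperparameter sensitivity the abstract describes for DBSCAN can be illustrated with a toy example. The following is a minimal one-dimensional DBSCAN sketch written for this illustration only; it is not PLSCAN and not the authors' code, and the names `dbscan_1d`, `count_clusters`, and the sample `points` are assumptions.

```python
def dbscan_1d(points, eps, min_samples):
    """Toy DBSCAN on 1-D data. Returns a cluster id per point, -1 for noise."""
    n = len(points)
    # Neighborhoods include the point itself.
    neighbors = [
        [j for j in range(n) if abs(points[i] - points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_samples for nb in neighbors]

    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not is_core[i]:
            continue
        # Grow a new cluster from an unlabeled core point.
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            if not is_core[p]:
                continue  # border points join a cluster but do not expand it
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    stack.append(q)
        cluster += 1
    return labels


def count_clusters(labels):
    """Number of distinct clusters, ignoring noise (-1)."""
    return len(set(labels) - {-1})


# Two tight groups and one outlier (illustrative data).
points = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2, 5.0]
counts = {eps: count_clusters(dbscan_1d(points, eps, min_samples=3))
          for eps in (0.05, 0.15, 2.0)}
```

On this data, the density threshold alone changes the answer: `eps=0.05` labels everything as noise, `eps=0.15` recovers the two groups, and `eps=2.0` merges them into a single cluster.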

Taxonomy

Topics: Topological and Geometric Data Analysis · Advanced Clustering Algorithms Research · Data Visualization and Analytics