Enhancing cluster analysis via topological manifold learning
Moritz Herrmann, Daniyal Kazempour, Fabian Scheipl, Peer Kröger

TL;DR
This paper shows that inferring the topological structure of data with manifold learning before clustering considerably improves cluster detection and reduces parameter sensitivity, including in high-dimensional problems with clusters of varying density or entangled shapes.
Contribution
It introduces a method combining UMAP for topological inference with DBSCAN for clustering, enhancing performance and robustness over traditional approaches.
Findings
Topological pre-processing simplifies clustering.
Clustering in learned manifolds improves accuracy.
Method outperforms complex existing algorithms.
Abstract
We discuss topological aspects of cluster analysis and show that inferring the topological structure of a dataset before clustering it can considerably enhance cluster detection: theoretical arguments and empirical evidence show that clustering embedding vectors, representing the structure of a data manifold instead of the observed feature vectors themselves, is highly beneficial. To demonstrate, we combine the manifold learning method UMAP for inferring the topological structure with the density-based clustering method DBSCAN. Synthetic and real data results show that this both simplifies and improves clustering in a diverse set of low- and high-dimensional problems including clusters of varying density and/or entangled shapes. Our approach simplifies clustering because topological pre-processing consistently reduces parameter sensitivity of DBSCAN. Clustering the resulting embeddings with…
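One way to probe the parameter-sensitivity claim is to sweep DBSCAN's `eps` and record how well the result agrees with known labels at each value. The sketch below is an assumed setup, not the paper's evaluation protocol: the helper names `ari_over_eps` and `compare_raw_vs_embedded`, the adjusted Rand index as the score, the two-moons data, and the `eps` grid are all illustrative choices. Only the raw-feature sweep runs at import time. The comparison helper needs the `umap-learn` package.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score


def ari_over_eps(points, true_labels, eps_grid, min_samples=5):
    """Return one adjusted Rand index per eps value for DBSCAN on `points`.

    Noise points (label -1) are scored as if they formed their own group.
    """
    scores = []
    for eps in eps_grid:
        pred = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
        scores.append(adjusted_rand_score(true_labels, pred))
    return np.array(scores)


def compare_raw_vs_embedded(X, y, eps_grid, seed=0):
    """Sweep eps on raw features and on a UMAP embedding (hypothetical helper).

    Distances in the two spaces are on different scales, so a shared grid
    is only a rough comparison; separate grids may be more appropriate.
    """
    import umap  # imported lazily; provided by the umap-learn package

    embedding = umap.UMAP(
        n_components=2, min_dist=0.0, random_state=seed
    ).fit_transform(X)
    return ari_over_eps(X, y, eps_grid), ari_over_eps(embedding, y, eps_grid)


# Toy data and grid (assumed values for illustration).
X, y = make_moons(n_samples=300, noise=0.05, random_state=0)
eps_grid = np.linspace(0.05, 1.0, 20)

# Sweep on the observed feature vectors only.
raw_scores = ari_over_eps(X, y, eps_grid)
print("raw-feature ARI per eps:", np.round(raw_scores, 2))
```

Calling `compare_raw_vs_embedded(X, y, eps_grid)` returns both score arrays. A wider range of `eps` values with high scores on the embedding would be consistent with the reduced sensitivity the abstract describes.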
Taxonomy
Topics: Topological and Geometric Data Analysis · Advanced Clustering Algorithms Research
