Clustering -- Basic concepts and methods
Jan-Oliver Felix Kapp-Joswig, Bettina G. Keller

TL;DR
This paper provides an introductory review of clustering techniques, discussing fundamental concepts, data preparation, various algorithms, and validation methods for clustering analysis.
Contribution
It offers a comprehensive overview of clustering methods, including connectivity-based, prototype-based, and density-based approaches, with insights into their implementation and validation.
Findings
Comparison of different clustering algorithms
Discussion on data representation and preparation
Overview of validation techniques for clustering results
Abstract
We review clustering as an analysis tool and the underlying concepts from an introductory perspective. What is clustering and how can clusterings be realised programmatically? How can data be represented and prepared for a clustering task? And how can clustering results be validated? Connectivity-based versus prototype-based approaches are reflected in the context of several popular methods: single-linkage, spectral embedding, k-means, and Gaussian mixtures are discussed as well as the density-based protocols (H)DBSCAN, Jarvis-Patrick, CommonNN, and density-peaks.
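The contrast between connectivity-based and prototype-based clustering named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the toy data, the function names `single_linkage` and `k_means`, and the naive seeding of k-means with the first k points are all assumptions made for illustration. Single-linkage merges clusters through their closest pair of members, while k-means assigns each point to its nearest centre.

```python
import math

# Hypothetical toy data: two well-separated groups of 2D points.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]


def single_linkage(points, k):
    """Connectivity-based: merge the two closest clusters until k remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: cluster distance is the closest member pair.
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def k_means(points, k, n_iter=100):
    """Prototype-based: Lloyd's algorithm, alternating assignment and update."""
    # Assumption: the first k points serve as initial centres (naive seeding).
    centres = [points[i] for i in range(k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centres[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments no longer change: converged
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centre if a cluster is empty
                centres[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centres


linkage_labels = single_linkage(points, 2)
kmeans_labels, kmeans_centres = k_means(points, 2)
print(linkage_labels)
print(kmeans_labels)
```

On compact, well-separated groups like these, both approaches recover the same partition. They can be expected to diverge on elongated or non-convex clusters, where single-linkage follows chains of neighbouring points and k-means favours roughly spherical groups around each centre.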
Taxonomy
Topics: Advanced Clustering Algorithms Research
