Seeking the Truth Beyond the Data. An Unsupervised Machine Learning Approach
Dimitrios Saligkaras, Vasileios E. Papageorgiou

TL;DR
This paper reviews and compares major clustering algorithms in unsupervised machine learning, analyzing their efficiency, accuracy, and complexity across different datasets to guide their application based on data size.
Contribution
It provides a comprehensive review of clustering methods, including parameter selection and initialization, and compares their performance on multiple datasets, highlighting strengths and weaknesses.
Findings
Clustering algorithms vary in accuracy and complexity depending on dataset size.
The paper identifies optimal conditions for different clustering techniques.
Results guide the choice of clustering methods based on data characteristics.
Abstract
Clustering is an unsupervised machine learning methodology in which unlabeled elements/objects are grouped together, with the aim of constructing well-established clusters whose elements are classified according to their similarity. The goal of this process is to provide a useful aid that helps the researcher identify patterns in the data. When dealing with large databases, such patterns may not be easily detectable without the contribution of a clustering algorithm. This article provides an in-depth description of the most widely used clustering methodologies, accompanied by useful presentations concerning suitable parameter selection and initializations. At the same time, this article not only serves as a review highlighting the major elements of the examined clustering techniques but also emphasizes the comparison of these algorithms' clustering efficiency based on 3 datasets,…
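To make the idea of grouping unlabeled elements by similarity concrete, here is a minimal k-means sketch in plain Python. It is an illustration only, not the authors' implementation: the toy points, the function names (`sq_dist`, `farthest_point_init`, `kmeans`), and the deterministic farthest-point initialization are all assumptions chosen for this example, and the paper's own parameter choices and datasets are not reproduced here.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic start (an illustrative assumption): take the first point,
    then repeatedly add the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd-style k-means; returns (labels, centroids)."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # no point changed cluster: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical toy data: two visibly separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
labels, centroids = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The result depends on both the number of clusters `k` and the starting centroids, which is why parameter selection and initialization matter when clustering methods are compared.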
Taxonomy
Topics: Advanced Clustering Algorithms Research · Text and Document Classification Technologies · Data Mining Algorithms and Applications
