Ensemble Method for Cluster Number Determination and Algorithm Selection in Unsupervised Learning
Antoine Zambelli

TL;DR
This paper introduces an ensemble clustering framework that simplifies the process of determining the number of clusters and selecting appropriate algorithms in unsupervised learning, reducing the need for expert knowledge.
Contribution
The novel ensemble framework automates cluster number determination and algorithm selection with minimal input, streamlining unsupervised learning workflows.
Findings
Effective in estimating the number of clusters
Automates algorithm selection process
Reduces reliance on expert knowledge
Abstract
Unsupervised learning, and clustering in particular, requires considerable expertise to be useful. Researchers must make careful and informed decisions about which algorithm to use, and with which hyperparameters, for a given dataset. They may also need to determine the number of clusters in the dataset, which is unfortunately itself an input to most clustering algorithms. All of this must happen before they can begin their actual subject-matter work. After quantifying the impact of algorithm and hyperparameter selection, we propose an ensemble clustering framework that can be used with minimal input. It determines both the number of clusters in a dataset and a suitable choice of algorithm for that dataset. A code library is included in the Conclusion for ease of integration.
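To make the idea of an ensemble vote on the number of clusters concrete, here is a minimal sketch in plain Python. It is not the paper's framework or its code library: the abstract does not specify the ensemble members or the scoring rule, so this sketch assumes k-means runs with different seeds as members and the silhouette score as the criterion. All function names (`kmeans`, `silhouette`, `estimate_k`) and the toy data are hypothetical.

```python
import math
import random
from collections import Counter


def kmeans(points, k, seed, iters=50):
    """Lloyd's algorithm with farthest-first initialisation; returns labels."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Next centre: the point farthest from its nearest existing centre.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties out
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        total += (b - a) / max(a, b)
    return total / len(points)


def estimate_k(points, candidate_ks=range(2, 6), seeds=range(5)):
    """Each ensemble member (one seed) votes for its best-scoring k."""
    votes = Counter()
    for seed in seeds:
        best_k = max(candidate_ks,
                     key=lambda k: silhouette(points, kmeans(points, k, seed)))
        votes[best_k] += 1
    return votes.most_common(1)[0][0], dict(votes)


# Toy data (assumed for illustration): three tight, well-separated groups.
offsets = [(0, 0), (-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5)]
points = [(cx + dx, cy + dy)
          for cx, cy in [(0, 0), (10, 0), (5, 9)]
          for dx, dy in offsets]

k_hat, votes = estimate_k(points)
print(k_hat, votes)
```

A majority vote keeps the estimate stable when individual members disagree. Under the same assumptions, the sketch could be extended toward algorithm selection by letting members differ in algorithm as well as seed and comparing their scores at the winning cluster count.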
Taxonomy
Topics: Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models
Methods: Ensemble Clustering
