CLAMS: A System for Zero-Shot Model Selection for Clustering
Prabhant Singh, Pieter Gijsbers, Murat Onur Yildirim, Elif Ceren Gok, Joaquin Vanschoren

TL;DR
This paper introduces CLAMS, an AutoML system that uses dataset similarity based on optimal transport to automatically select the best clustering algorithm, outperforming existing baselines.
Contribution
It presents a novel AutoML pipeline for clustering that leverages dataset similarity measures, expanding AutoML beyond supervised learning.
Findings
Outperforms multiple clustering baselines
Demonstrates effectiveness of similarity-based model selection
Establishes a new AutoML approach for clustering
Abstract
We propose an AutoML system that enables model selection on clustering problems by leveraging optimal transport-based dataset similarity. Our objective is to establish a comprehensive AutoML pipeline for clustering problems and provide recommendations for selecting the most suitable algorithms, thus opening up a new area of AutoML beyond the traditional supervised learning settings. We compare our results against multiple clustering baselines and find that our system outperforms all of them, hence demonstrating the utility of similarity-based automated model selection for clustering applications.
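The idea of similarity-based zero-shot selection can be illustrated with a minimal sketch. It is not the CLAMS implementation: the function names, the meta-dataset dictionary, the subsampling step, and the use of an exact assignment solver are all assumptions made for illustration, and it assumes every dataset has the same number of features. The sketch measures an optimal transport cost between a new dataset and each previously seen dataset, then recommends the algorithm recorded as best for the closest one.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def ot_distance(X, Y):
    """Exact OT cost between two equal-size, uniformly weighted point clouds.

    With uniform weights and equal sample counts, an optimal transport plan
    is a one-to-one matching, so an assignment solver gives the exact cost
    (the squared 2-Wasserstein distance between the two empirical samples).
    """
    cost = cdist(X, Y, metric="sqeuclidean")
    rows, cols = linear_sum_assignment(cost)
    return float(cost[rows, cols].mean())


def select_algorithm(new_data, meta_datasets, n_points=64, seed=0):
    """Recommend the algorithm stored for the most similar known dataset.

    meta_datasets is a hypothetical mapping:
        name -> (feature array, name of the best known clustering algorithm)
    All arrays are assumed to share the same feature dimension.
    """
    rng = np.random.default_rng(seed)

    def subsample(X):
        # Draw a fixed number of rows so both point clouds have equal size.
        idx = rng.choice(len(X), size=n_points, replace=len(X) < n_points)
        return X[idx]

    new_sample = subsample(new_data)
    distances = {
        name: ot_distance(new_sample, subsample(X))
        for name, (X, _) in meta_datasets.items()
    }
    nearest = min(distances, key=distances.get)
    return meta_datasets[nearest][1], nearest, distances
```

No clustering algorithm is run on the new dataset before a recommendation is made, which is what makes the selection zero-shot. A real system would also need a way to compare datasets with different feature spaces, which this sketch deliberately leaves out.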
Taxonomy
Topics: Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research · Anomaly Detection Techniques and Applications
