ZEUS: Zero-shot Embeddings for Unsupervised Separation of Tabular Data
Patryk Marszałek, Tomasz Kuśmierczyk, Witold Wydmański, Jacek Tabor, Marek Śmieja

TL;DR
ZEUS introduces a zero-shot, unsupervised embedding method for clustering tabular data, eliminating the need for dataset-specific tuning and outperforming existing methods in speed and usability.
Contribution
It is the first zero-shot approach for generating embeddings for tabular data in an unsupervised manner, enabling effective clustering without additional training.
Findings
Performs on par or better than traditional clustering algorithms
Faster and more user-friendly than existing deep learning methods
Generalizes across various datasets without fine-tuning
Abstract
Clustering tabular data remains a significant open challenge in data analysis and machine learning. Unlike for image data, similarity between tabular records often varies across datasets, making the definition of clusters highly dataset-dependent. Furthermore, the absence of supervised signals complicates hyperparameter tuning in deep learning clustering methods, frequently resulting in unstable performance. To address these issues and reduce the need for per-dataset tuning, we adopt an emerging approach in deep learning: zero-shot learning. We propose ZEUS, a self-contained model capable of clustering new datasets without any additional training or fine-tuning. It operates by decomposing complex datasets into meaningful components that can then be clustered effectively. Thanks to pre-training on synthetic datasets generated from a latent-variable prior, it generalizes across various…
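The abstract describes two ideas: pre-training on synthetic datasets drawn from a latent-variable prior, and clustering a new dataset from its embeddings without further training. The sketch below illustrates that workflow under loud assumptions and is not the authors' implementation. The prior shown (a Gaussian mixture in latent space pushed through a random network), the function names, and all parameters are hypothetical. `placeholder_embed` is only a stand-in for a pretrained encoder: it standardizes columns and does not reproduce ZEUS. The clustering step uses plain k-means as one possible choice.

```python
import numpy as np


def sample_synthetic_dataset(n_rows=300, n_clusters=3, latent_dim=2,
                             n_features=5, seed=0):
    """Toy latent-variable prior (an assumption, not the paper's prior).

    Draws a cluster label per row, samples a latent vector around that
    cluster's center, and maps it to observed features with a random
    one-hidden-layer network.
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, n_clusters, size=n_rows)
    centers = rng.normal(scale=5.0, size=(n_clusters, latent_dim))
    latent = centers[labels] + rng.normal(scale=0.5, size=(n_rows, latent_dim))
    w1 = rng.normal(size=(latent_dim, 16))
    w2 = rng.normal(size=(16, n_features))
    features = np.tanh(latent @ w1) @ w2
    return features, labels


def placeholder_embed(features):
    """Hypothetical stand-in for a pretrained zero-shot encoder.

    It only standardizes each column; a real encoder would be a trained
    model applied to the dataset without any fine-tuning.
    """
    std = features.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (features - features.mean(axis=0)) / std


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per row."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n, k).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = points[assign == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return assign


# No training happens on the "new" dataset: embed once, then cluster.
X, y = sample_synthetic_dataset()
Z = placeholder_embed(X)
pred = kmeans(Z, k=3)
```

The point of the sketch is the shape of the pipeline: the encoder is fixed before the new dataset arrives, so the only per-dataset choice left is the clustering step applied to the embeddings.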
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Face recognition and analysis · Machine Learning and Data Classification
Methods: ADaptive gradient method with the OPTimal convergence rate
