Mixed data Deep Gaussian Mixture Model: A clustering model for mixed datasets
Robin Fuchs, Denys Pommeret, Cinzia Viroli

TL;DR
This paper introduces the Mixed Deep Gaussian Mixture Model (MDGMM), a flexible multilayer clustering approach for mixed datasets that automatically determines the best model configuration and provides low-dimensional visualizations.
Contribution
The paper presents a novel multilayer architecture that merges clustering of continuous and non-continuous data, generalizing existing models and including an automatic model selection strategy.
Findings
Outperforms state-of-the-art mixed data clustering models
Provides meaningful low-dimensional visualizations
Automatically selects optimal number of clusters
Abstract
Clustering mixed data presents numerous challenges inherent to the very heterogeneous nature of the variables. A clustering algorithm should be able, despite this heterogeneity, to extract discriminant pieces of information from the variables in order to design groups. In this work we introduce a multilayer architecture model-based clustering method called Mixed Deep Gaussian Mixture Model (MDGMM) that can be viewed as an automatic way to merge the clustering performed separately on continuous and non-continuous data. This architecture is flexible and can be adapted to mixed as well as to continuous or non-continuous data. In this sense we generalize Generalized Linear Latent Variable Models and Deep Gaussian Mixture Models. We also design a new initialisation strategy and a data-driven method that selects the best specification of the model and the optimal number of clusters for a…
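As a rough illustration of the layered structure the abstract refers to, the Deep Gaussian Mixture Model that MDGMM generalizes is commonly written as a stack of mixtures of linear Gaussian (factor-analyzer-like) layers. The notation below (h layers, component index s_l at layer l, latent variables z^{(l)}, loadings \Lambda, noise covariances \Psi) follows the usual DGMM formulation and is an assumption here, not taken from this abstract; the exact parameterisation used by MDGMM, in particular how non-continuous variables enter the first layer, may differ.

```latex
\begin{align*}
y_i         &= \eta^{(1)}_{s_1} + \Lambda^{(1)}_{s_1}\, z^{(1)}_i + u^{(1)}_i
            && \text{with probability } \pi^{(1)}_{s_1}, \\
z^{(1)}_i   &= \eta^{(2)}_{s_2} + \Lambda^{(2)}_{s_2}\, z^{(2)}_i + u^{(2)}_i
            && \text{with probability } \pi^{(2)}_{s_2}, \\
            &\;\;\vdots \\
z^{(h-1)}_i &= \eta^{(h)}_{s_h} + \Lambda^{(h)}_{s_h}\, z^{(h)}_i + u^{(h)}_i
            && \text{with probability } \pi^{(h)}_{s_h}, \\
z^{(h)}_i   &\sim \mathcal{N}(0, I), \qquad
u^{(l)}_i \sim \mathcal{N}\bigl(0, \Psi^{(l)}_{s_l}\bigr).
\end{align*}
```

Under this reading, each path (s_1, ..., s_h) through the layers defines one Gaussian component of the observed data, and clusters are typically read off from the component memberships at a chosen layer.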
Taxonomy
Topics: Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research · Gaussian Processes and Bayesian Inference
