Enhancing the selection of a model-based clustering with external qualitative variables
Jean-Patrick Baudry, Margarida Cardoso, Gilles Celeux, Maria José Amorim, Ana Sousa Ferreira

TL;DR
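This paper proposes a method for selecting a model-based clustering (a mixture model and a number of clusters) with the help of external categorical variables that take no part in fitting the models. The aim is a partition that is easier to interpret without compromising the fit to the data.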
Contribution
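It introduces a selection criterion based on the integrated joint likelihood of the data and the partitions at hand (the model-based partition and the partitions associated with the external variables), so that the selected model both fits the data well and relates to the external variables.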
Findings
The proposed criterion is designed to balance fit to the data against relevance to the external variables.
Numerical experiments illustrate promising behaviour of the criterion for model selection.
The approach aims to make clustering results easier to interpret in the light of the external variables.
Abstract
In cluster analysis, it can be useful to interpret the partition built from the data in the light of external categorical variables which were not directly involved in clustering the data. An approach is proposed in the model-based clustering context to select a model and a number of clusters which both fit the data well and take advantage of the potential illustrative ability of the external variables. This approach makes use of the integrated joint likelihood of the data and the partitions at hand, namely the model-based partition and the partitions associated with the external variables. It is noteworthy that each mixture model is fitted to the data by maximum likelihood, excluding the external variables, which are used only to select a relevant mixture model. Numerical experiments illustrate the promising behaviour of the derived criterion.
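To make the idea in the abstract concrete, here is a minimal Python sketch of how a selection score could combine data fit with agreement between a candidate partition and an external categorical variable. It is not the criterion derived in the paper: the integrated joint likelihood is replaced here by a BIC-style penalised approximation chosen only for illustration. The function names `external_loglik` and `illustrative_score`, and the inputs `data_loglik` and `n_free_params` (assumed to come from a mixture model already fitted by maximum likelihood without the external variable), are hypothetical.

```python
from collections import Counter
from math import log


def external_loglik(clusters, external):
    """Maximised log-likelihood of the external labels given the cluster labels.

    Computes sum over (k, l) of n_kl * log(n_kl / n_k), i.e. a multinomial
    model for the external variable inside each cluster, with probabilities
    set to the observed within-cluster frequencies.
    """
    if len(clusters) != len(external):
        raise ValueError("label sequences must have the same length")
    joint = Counter(zip(clusters, external))  # n_kl: cluster k, external level l
    per_cluster = Counter(clusters)           # n_k: size of cluster k
    return sum(n * log(n / per_cluster[k]) for (k, _), n in joint.items())


def illustrative_score(data_loglik, n_free_params, clusters, external):
    """BIC-style score: data fit + external-partition fit - complexity penalty.

    Assumption for illustration only; the paper's criterion may differ.
    """
    n = len(clusters)
    n_clusters = len(set(clusters))
    n_levels = len(set(external))
    # Mixture parameters plus (levels - 1) free proportions per cluster.
    n_params = n_free_params + n_clusters * (n_levels - 1)
    return data_loglik + external_loglik(clusters, external) - 0.5 * n_params * log(n)
```

Under this sketch, each candidate model would be fitted to the data alone, its partition obtained (for instance by assigning each observation to its most probable component), and the model with the highest score retained. A partition that lines up perfectly with the external variable contributes an external term of zero, while a partition unrelated to it contributes a strongly negative term.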
Taxonomy
Topics: Bayesian Methods and Mixture Models · Data Management and Algorithms · Advanced Clustering Algorithms Research
