Estimation and Model Selection for Model-Based Clustering with the Conditional Classification Likelihood
Jean-Patrick Baudry

TL;DR
This paper provides a theoretical analysis of the ICL criterion in model-based clustering, introducing the conditional classification likelihood and establishing its properties and relation to ICL.
Contribution
It introduces the conditional classification likelihood, studies its properties, and clarifies the theoretical relationship between ICL and this new criterion.
Findings
ICL is proved to be an approximation of one of the model selection criteria derived from the conditional classification likelihood.
The properties of the new estimator and criteria are studied, and the results run counter to the prevailing view that ICL is not consistent.
Insights into the class notion underlying ICL are provided.
Abstract
The Integrated Completed Likelihood (ICL) criterion was proposed by Biernacki et al. (2000) in the model-based clustering framework to select a relevant number of classes, and it has been used by statisticians in various application areas. A theoretical study of this criterion is proposed. A contrast related to the clustering objective is introduced: the conditional classification likelihood. This yields an estimator and a class of model selection criteria. The properties of these new procedures are studied, and ICL is proved to be an approximation of one of these criteria. We contrast these results with the current leading point of view about ICL, namely that it would not be consistent. Moreover, these results give insights into the class notion underlying ICL and feed a reflection on the class notion in clustering. General results on penalized minimum contrast criteria and on mixture models are…
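To make the contrast named in the abstract more concrete, here is a minimal sketch. It assumes the usual formulation in which the conditional classification log-likelihood is the observed log-likelihood minus the entropy of the posterior membership probabilities, and it restricts itself to a one-dimensional Gaussian mixture. The function name `conditional_classification_loglik` and its parameters are hypothetical and chosen for illustration only; this is not the author's implementation.

```python
import numpy as np

def conditional_classification_loglik(x, weights, means, sds):
    """Return (log L, entropy, log L_cc) for a 1-D Gaussian mixture.

    Assumption (not stated in the summary above):
    log L_cc = log L - ENT, where ENT is the entropy of the
    posterior membership probabilities tau_ik.
    """
    x = np.asarray(x, dtype=float)[:, None]        # shape (n, 1)
    weights = np.asarray(weights, dtype=float)     # shape (K,)
    means = np.asarray(means, dtype=float)         # shape (K,)
    sds = np.asarray(sds, dtype=float)             # shape (K,)

    # log(pi_k * N(x_i; mu_k, sd_k^2)) for every point and component, shape (n, K)
    log_comp = (np.log(weights)
                - 0.5 * np.log(2.0 * np.pi * sds ** 2)
                - 0.5 * ((x - means) / sds) ** 2)

    # Observed log-likelihood, computed with a stable log-sum-exp over components
    row_max = log_comp.max(axis=1, keepdims=True)
    log_mix = row_max[:, 0] + np.log(np.exp(log_comp - row_max).sum(axis=1))
    log_lik = log_mix.sum()

    # Posterior membership probabilities tau_ik and their entropy
    log_tau = log_comp - log_mix[:, None]
    tau = np.exp(log_tau)
    entropy = -np.sum(tau * log_tau)

    return log_lik, entropy, log_lik - entropy


# Usage: two well-separated components give a small entropy term,
# so log L_cc stays close to log L.
data = np.array([-5.1, -4.9, -5.0, 5.0, 5.2, 4.8])
log_lik, entropy, log_lcc = conditional_classification_loglik(
    data, weights=[0.5, 0.5], means=[-5.0, 5.0], sds=[1.0, 1.0]
)
```

Under this assumption the entropy term penalizes parameter values whose components overlap heavily, which is one way to read the abstract's remark that the contrast is tied to the clustering objective.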
Taxonomy
Topics: Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research · Gene expression and cancer classification
