Optimal Bayesian estimators for latent variable cluster models
Riccardo Rastelli, Nial Friel

TL;DR
This paper introduces a Bayesian decision-theoretic method that summarises posterior samples of cluster allocations into a single optimal clustering. The number of groups is determined automatically, and the method applies across clustering models such as finite and infinite mixtures, hidden Markov models and network block models.
Contribution
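It proposes a fast, context-independent greedy algorithm that searches for the clustering that is optimal under a decision-theoretic criterion, so that clustering and model selection (the choice of the number of groups) are handled in a single step.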
Findings
The method effectively identifies optimal partitions in diverse models.
It automatically determines the number of clusters without prior specification.
The approach is validated on real datasets across different clustering frameworks.
Abstract
In cluster analysis interest lies in probabilistically capturing partitions of individuals, items or observations into groups, such that those belonging to the same group share similar attributes or relational profiles. Bayesian posterior samples for the latent allocation variables can be effectively obtained in a wide range of clustering models, including finite mixtures, infinite mixtures, hidden Markov models and block models for networks. However, due to the categorical nature of the clustering variables and the lack of scalable algorithms, summary tools that can interpret such samples are not available. We adopt a Bayesian decision theoretic approach to define an optimality criterion for clusterings, and propose a fast and context-independent greedy algorithm to find the best allocations. One important facet of our approach is that the optimal number of groups is automatically…
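As an illustration of the idea in the abstract, the following is a minimal sketch of a greedy search for a partition that minimises an expected posterior loss, computed from sampled allocations. It is not the authors' implementation. Binder's loss is an assumed example of a loss function (the abstract above does not name one), and the function names `posterior_similarity`, `expected_binder_loss` and `greedy_partition` are hypothetical.

```python
import numpy as np

def posterior_similarity(samples):
    """psm[i, j] = fraction of posterior samples in which items i and j share a label."""
    samples = np.asarray(samples)
    return (samples[:, :, None] == samples[:, None, :]).mean(axis=0)

def expected_binder_loss(z, psm):
    """Posterior expected Binder loss of partition z (assumed loss, equal unit costs)."""
    z = np.asarray(z)
    same = (z[:, None] == z[None, :]).astype(float)
    upper = np.triu_indices(len(z), k=1)
    return np.abs(same - psm)[upper].sum()

def greedy_partition(samples, max_groups=None, max_sweeps=1000, seed=0):
    """Greedy item-by-item reallocation; empty groups are allowed, so the
    number of non-empty groups in the result is not fixed in advance."""
    rng = np.random.default_rng(seed)
    psm = posterior_similarity(samples)
    n = psm.shape[0]
    k = max_groups or n
    z = rng.integers(k, size=n)          # random starting partition
    gain = 1.0 - 2.0 * psm               # loss change from pairing i with j
    np.fill_diagonal(gain, 0.0)
    for _ in range(max_sweeps):
        changed = False
        for i in rng.permutation(n):
            # cost[g]: loss contribution of item i if it is placed in group g
            cost = np.bincount(z, weights=gain[i], minlength=k)
            best = int(np.argmin(cost))
            if cost[best] < cost[z[i]] - 1e-12:   # move only on strict improvement
                z[i] = best
                changed = True
        if not changed:
            break
    _, z = np.unique(z, return_inverse=True)      # relabel groups as 0..K-1
    return z

# Toy posterior sample: two groups, label switching, and one noisy draw.
samples = [[0, 0, 0, 1, 1, 1]] * 8 + [[1, 1, 1, 0, 0, 0]] * 2 + [[0, 0, 1, 1, 1, 1]]
z_hat = greedy_partition(samples)
print(z_hat, expected_binder_loss(z_hat, posterior_similarity(samples)))
```

Because the loss depends only on which items are grouped together, the sketch is unaffected by label switching in the samples. A greedy search of this kind can stop at a local optimum, so running it from several seeds and keeping the lowest-loss result is a reasonable precaution.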
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference · Census and Population Estimation
