Flexible clustering via hidden hierarchical Dirichlet priors
Antonio Lijoi, Igor Prünster, Giovanni Rebaudo

TL;DR
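This paper introduces a flexible Bayesian clustering model based on hidden hierarchical Dirichlet priors. It clusters both samples and observations, avoids the nested Dirichlet process's drawback of grouping distributions into a single cluster when ties are observed across samples, and comes with efficient inference algorithms.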
Contribution
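It develops a novel nonparametric prior for clustering, built as the composition of two different discrete random structures, derives a closed-form expression for the induced random partition distribution, and proposes efficient MCMC algorithms, including a new scheme for multiple populations and homogeneity testing.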
Findings
The new model offers more flexible clustering across heterogeneous populations.
The derived algorithms outperform traditional nested Dirichlet process methods.
Illustrative examples demonstrate improved clustering and testing capabilities.
Abstract
The Bayesian approach to inference stands out for naturally allowing borrowing of information across heterogeneous populations, with different samples possibly sharing the same distribution. A popular Bayesian nonparametric model for clustering probability distributions is the nested Dirichlet process, which, however, has the drawback of grouping distributions in a single cluster when ties are observed across samples. With the goal of achieving a flexible and effective clustering method for both samples and observations, we investigate a nonparametric prior that arises as the composition of two different discrete random structures and derive a closed-form expression for the induced distribution of the random partition, the fundamental tool regulating the clustering behavior of the model. On the one hand, this allows one to gain deeper insight into the theoretical properties of the model and,…
