# Clustering Multivariate Data using Factor Analytic Bayesian Mixtures with an Unknown Number of Components

**Authors:** Panagiotis Papastamoulis

arXiv: 1906.00348 · 2019-08-29

## TL;DR

This paper extends overfitting Bayesian mixtures of factor analyzers for clustering multivariate data. The number of clusters is estimated from the MCMC output of a deliberately overfitted mixture, and the covariance parameterization and number of factors are selected with the Bayesian Information Criterion.

## Contribution

It extends overfitting Bayesian mixtures of factor analyzers with a set of eight parameterizations that give parsimonious representations of the per-cluster covariance matrix, and samples from the posterior with a Gibbs sampler combined with a prior parallel tempering scheme.
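For orientation, the covariance structure behind these parameterizations can be written in the standard mixture-of-factor-analyzers form. The notation below ($p$, $q$, $K$, $\mu_k$, $\Lambda_k$, $\Sigma_k$) is an assumption chosen for illustration and is not taken from this summary:

$$
x_i \mid (z_i = k) \sim \mathcal{N}_p\!\left(\mu_k,\; \Lambda_k \Lambda_k^{\top} + \Sigma_k\right), \qquad k = 1, \dots, K,
$$

where $\Lambda_k$ is a $p \times q$ matrix of factor loadings and $\Sigma_k$ is a diagonal matrix of error variances. One common way to obtain eight parsimonious variants, used in the literature on parsimonious Gaussian mixture models, is to switch three constraints on or off independently:

$$
\Lambda_k = \Lambda, \qquad \Sigma_k = \Sigma, \qquad \Sigma_k = \sigma_k^2 I_p ,
$$

giving $2^3 = 8$ combinations. Whether the paper's eight parameterizations match these exactly should be checked against the full text.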

## Key findings

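- Estimates the unknown number of clusters and the model parameters from the MCMC output of an overfitted mixture.
- Demonstrated on simulated and real datasets and compared with similar models estimated using the Expectation-Maximization algorithm.
- Implemented in the R package fabMix, available on CRAN.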
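
The idea of estimating the number of clusters from an overfitted mixture can be illustrated with a small sketch. It assumes, as is common for overfitting mixtures, that the estimate is the most frequent number of non-empty components across MCMC draws. The function names and the toy allocation vectors are hypothetical and are not part of fabMix:

```python
from collections import Counter


def count_alive_components(allocations):
    """Return how many distinct component labels hold at least one observation."""
    return len(set(allocations))


def estimate_num_clusters(allocation_draws):
    """Return the most frequent number of non-empty components across draws."""
    counts = Counter(count_alive_components(z) for z in allocation_draws)
    # most_common(1) returns a list with one (value, frequency) pair
    return counts.most_common(1)[0][0]


# Made-up allocation vectors for 6 observations and up to 8 components.
# Each inner list is one hypothetical MCMC draw of the cluster labels.
draws = [
    [0, 0, 3, 3, 7, 7],
    [0, 0, 3, 3, 7, 3],
    [1, 1, 3, 3, 3, 3],
    [0, 0, 5, 5, 7, 7],
]

print(estimate_num_clusters(draws))
```

Label values differ between draws (for example 3 versus 5), which does not affect the count of non-empty components. It does affect cluster-specific summaries, which is why label switching has to be handled separately.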

## Abstract

Recent work on overfitting Bayesian mixtures of distributions offers a powerful framework for clustering multivariate data using a latent Gaussian model which resembles the factor analysis model. The flexibility provided by overfitting mixture models yields a simple and efficient way to estimate the unknown number of clusters and model parameters by Markov chain Monte Carlo (MCMC) sampling. The present study extends this approach by considering a set of eight parameterizations, giving rise to parsimonious representations of the covariance matrix per cluster. A Gibbs sampler combined with a prior parallel tempering scheme is implemented to approximately sample from the posterior distribution of the overfitting mixture. The parameterization and number of factors are selected according to the Bayesian Information Criterion. Identifiability issues related to label switching are dealt with by post-processing the simulated output with the Equivalence Classes Representatives algorithm. The contributed method and software are demonstrated and compared to similar models estimated using the Expectation-Maximization algorithm on simulated and real datasets. The software is available online at https://CRAN.R-project.org/package=fabMix.
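
The selection step by the Bayesian Information Criterion can be sketched as follows. This is a minimal illustration, assuming the convention BIC = -2 log L + d log n, where lower is better. The helper names, the parameterization labels, and all numeric values are made up for the example and do not come from the paper or from fabMix:

```python
import math


def bic(log_likelihood, num_free_params, num_obs):
    """BIC under the 'lower is better' convention: -2 log L + d log n."""
    return -2.0 * log_likelihood + num_free_params * math.log(num_obs)


def select_model(candidates, num_obs):
    """Pick the candidate with the smallest BIC.

    candidates maps (parameterization_label, num_factors) to a pair
    (log_likelihood, num_free_params).
    """
    scores = {key: bic(ll, d, num_obs) for key, (ll, d) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fits: labels, log-likelihoods and parameter counts are invented.
candidates = {
    ("UUU", 1): (-1200.0, 60),
    ("UCU", 1): (-1210.0, 40),
    ("CCC", 2): (-1300.0, 25),
}

best, scores = select_model(candidates, num_obs=200)
print(best)
for key, value in sorted(scores.items(), key=lambda kv: kv[1]):
    print(key, round(value, 1))
```

The sketch takes the log-likelihood and the number of free parameters as inputs on purpose: how each is obtained for a given parameterization depends on details in the full paper.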

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.00348/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/1906.00348/full.md

## References

65 references — full list in the complete paper: https://tomesphere.com/paper/1906.00348/full.md

---
Source: https://tomesphere.com/paper/1906.00348