BayesBinMix: an R Package for Model Based Clustering of Multivariate Binary Data
Panagiotis Papastamoulis, Magnus Rattray

TL;DR
BayesBinMix is an R package for Bayesian clustering of multivariate binary data. It uses MCMC sampling to estimate the number of clusters jointly with the model parameters, and it accepts data with missing values.
Contribution
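It introduces a Bayesian clustering approach for mixtures of multivariate Bernoulli distributions, fitted by MCMC with parallel heated chains. The approach estimates the number of components and applies label switching algorithms to address identifiability.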
Findings
Outperforms the EM algorithm in the simulation study
Handles binary data with missing values
Estimates the number of clusters jointly with the model parameters
Abstract
The BayesBinMix package offers a Bayesian framework for clustering binary data with or without missing values by fitting mixtures of multivariate Bernoulli distributions with an unknown number of components. It allows the joint estimation of the number of clusters and model parameters using Markov chain Monte Carlo sampling. Heated chains are run in parallel and accelerate the convergence to the target posterior distribution. Identifiability issues are addressed by implementing label switching algorithms. The package is demonstrated and benchmarked against the Expectation-Maximization algorithm using a simulation study as well as a real dataset.
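To make the model in the abstract concrete, the following is a minimal Python sketch of the density of a mixture of multivariate Bernoulli distributions and of the resulting cluster membership probabilities for one observation. It is an illustration under stated assumptions, not the package's code: the function names `log_bernoulli_mixture` and `responsibilities` are hypothetical, the number of components is fixed rather than sampled, and missing entries (marked `None`) are simply skipped, which assumes they are missing at random and may differ from how BayesBinMix treats them inside its MCMC sampler.

```python
import math

def log_bernoulli_mixture(x, weights, theta):
    """Log density of one binary vector under a K-component Bernoulli mixture.

    x       : list of 0, 1, or None (None marks a missing entry; assumption)
    weights : K mixing proportions that sum to 1
    theta   : K lists of success probabilities, one entry per variable
    Returns (log density, list of per-component log terms).
    """
    log_terms = []
    for w, th in zip(weights, theta):
        lt = math.log(w)
        for xj, p in zip(x, th):
            if xj is None:
                continue  # skipped entry contributes a factor of 1
            lt += math.log(p) if xj == 1 else math.log(1.0 - p)
        log_terms.append(lt)
    # log-sum-exp over components for numerical stability
    m = max(log_terms)
    total = m + math.log(sum(math.exp(t - m) for t in log_terms))
    return total, log_terms

def responsibilities(x, weights, theta):
    """Probability that x belongs to each component, given the parameters."""
    total, log_terms = log_bernoulli_mixture(x, weights, theta)
    return [math.exp(t - total) for t in log_terms]

if __name__ == "__main__":
    weights = [0.5, 0.5]
    theta = [[0.9, 0.9], [0.1, 0.1]]
    # Second variable is missing, so only the first one informs the allocation.
    print(responsibilities([1, None], weights, theta))
```

In a full Bayesian treatment these membership probabilities would be recomputed at every MCMC iteration from the current parameter draws, and the number of components would itself be updated; the sketch only shows the likelihood building block for fixed parameters.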
