A factor mixture analysis model for multivariate binary data
Silvia Cagnone, Cinzia Viroli

TL;DR
This paper introduces a novel latent variable model for multivariate binary data that combines factor analysis with mixture modeling to enable dimension reduction and clustering within heterogeneous populations.
Contribution
It replaces the traditional assumption of Gaussian-distributed factors with a finite mixture of multivariate Gaussians, allowing simultaneous dimension reduction and clustering in binary data analysis.
Findings
The model achieves dimension reduction for dichotomous data.
It simultaneously performs model-based clustering in the latent space.
Its performance is illustrated through a simulation study and two real-data applications.
Abstract
The paper proposes a latent variable model for binary data coming from an unobserved heterogeneous population. The heterogeneity is taken into account by replacing the traditional assumption of Gaussian distributed factors with a finite mixture of multivariate Gaussians. The aim of the proposed model is twofold: it achieves dimension reduction when the data are dichotomous and, simultaneously, it performs model-based clustering in the latent space. Model estimation is carried out by maximum likelihood via a generalized version of the EM algorithm. The performance of the model is evaluated through a simulation study and two real applications.
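To make the data-generating idea in the abstract concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes a logistic link between the latent factors and the binary items, which the abstract does not state. The function name `simulate_factor_mixture_binary` and every parameter value are hypothetical and chosen only for illustration. The sketch draws a latent class for each unit, draws the factors from that class's multivariate Gaussian, and then generates each binary item from a Bernoulli distribution.

```python
import numpy as np


def simulate_factor_mixture_binary(n, weights, means, covs, intercepts, loadings, seed=0):
    """Simulate binary data whose latent factors follow a Gaussian mixture.

    Assumption (not from the abstract): items are linked to the factors
    through a logistic function, P(y_j = 1 | z) = sigmoid(a_j + b_j' z).
    """
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    loadings = np.asarray(loadings, dtype=float)
    n_components = len(weights)
    n_factors = loadings.shape[1]

    # Latent class membership for each unit (the source of heterogeneity).
    labels = rng.choice(n_components, size=n, p=weights)

    # Factors: drawn from the Gaussian component of the unit's class.
    z = np.empty((n, n_factors))
    for k in range(n_components):
        idx = labels == k
        z[idx] = rng.multivariate_normal(means[k], covs[k], size=int(idx.sum()))

    # Binary items: Bernoulli draws with logistic response probabilities.
    eta = np.asarray(intercepts, dtype=float) + z @ loadings.T
    prob = 1.0 / (1.0 + np.exp(-eta))
    y = (rng.random(prob.shape) < prob).astype(int)
    return y, z, labels


# Hypothetical setting: 6 binary items, 2 factors, 2 latent clusters.
weights = [0.4, 0.6]
means = [np.array([-1.5, 0.0]), np.array([1.0, 0.0])]
covs = [np.eye(2), 0.5 * np.eye(2)]
intercepts = np.zeros(6)
loadings = np.array([[1.0, 0.0], [0.8, 0.2], [0.6, 0.4],
                     [0.4, 0.6], [0.2, 0.8], [0.0, 1.0]])

y, z, labels = simulate_factor_mixture_binary(500, weights, means, covs, intercepts, loadings)
```

Here `y` is the observed 500 x 6 binary matrix, while `z` (the two-dimensional factor scores) and `labels` (the cluster memberships) are the unobserved quantities that a fitted model would aim to recover.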
Taxonomy
Topics: Bayesian Methods and Mixture Models · Gene expression and cancer classification · Advanced Clustering Algorithms Research
