Bayesian Variational Inference for Mixed Data Mixture Models
Junyang Wang, James Bennett, Victor Lhoste, Sarah Filippi

TL;DR
This paper introduces a scalable variational inference algorithm for mixed data mixture models that efficiently captures uncertainty and is validated through simulations and real data analysis.
Contribution
It develops a coordinate ascent variational inference method for mixed data mixture models, providing theoretical guarantees and practical scalability.
Findings
The CAVI algorithm converges to the true parameter at an $O(1/n)$ rate.
The variational posterior contracts around the true parameter at an $O(n^{-1/2})$ rate.
The method is effective on both simulated data and real-world datasets.
Abstract
Heterogeneous, mixed-type datasets that include both continuous and categorical variables are ubiquitous, and they enrich data analysis by allowing more complex relationships and interactions to be modelled. Mixture models offer a flexible framework for capturing the underlying heterogeneity and relationships in mixed-type datasets. Most current approaches for modelling mixed data either forgo uncertainty quantification and conduct only point estimation, or use MCMC, which incurs a very high computational cost and does not scale to large datasets. This paper develops a coordinate ascent variational inference (CAVI) algorithm for mixture models on mixed (continuous and categorical) data, which circumvents the high computational cost of MCMC while retaining uncertainty quantification. We demonstrate our approach through simulation studies as well as an applied case study of the…
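To make the idea of coordinate ascent on a mixed-data mixture concrete, here is a minimal sketch in plain Python. It is not the paper's algorithm: it assumes a toy model with one continuous variable (unit-variance Gaussian per component, Gaussian prior on each mean), one categorical variable (Dirichlet prior on each component's category probabilities), and a Dirichlet prior on the mixing weights. The names `cavi_mixed`, `digamma`, `alpha0`, `beta0`, and `s0_sq` are hypothetical and chosen for illustration only.

```python
import math
import random


def digamma(x):
    """Digamma for x > 0: shift x upward by recurrence, then use an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return result + math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))


def cavi_mixed(xs, cs, K, M, n_iter=50, alpha0=1.0, beta0=1.0, s0_sq=10.0, seed=0):
    """Mean-field CAVI for a toy mixture over (continuous x, categorical c) pairs.

    Assumed model (illustrative only):
      pi ~ Dirichlet(alpha0), z_i ~ Cat(pi),
      mu_k ~ N(0, s0_sq), x_i | z_i = k ~ N(mu_k, 1),
      theta_k ~ Dirichlet(beta0), c_i | z_i = k ~ Cat(theta_k), c_i in {0, ..., M-1}.
    Returns responsibilities r and the variational parameters alpha, m, v, beta.
    """
    rng = random.Random(seed)
    n = len(xs)
    # Random initial responsibilities; each row sums to one.
    r = []
    for _ in range(n):
        w = [rng.random() + 1e-3 for _ in range(K)]
        total = sum(w)
        r.append([wk / total for wk in w])

    for _ in range(n_iter):
        # Global factors given the current responsibilities.
        Nk = [sum(r[i][k] for i in range(n)) for k in range(K)]
        alpha = [alpha0 + Nk[k] for k in range(K)]                 # q(pi) = Dirichlet(alpha)
        v = [1.0 / (1.0 / s0_sq + Nk[k]) for k in range(K)]        # q(mu_k) variance
        m = [v[k] * sum(r[i][k] * xs[i] for i in range(n)) for k in range(K)]  # q(mu_k) mean
        beta = [[beta0 + sum(r[i][k] for i in range(n) if cs[i] == c)
                 for c in range(M)] for k in range(K)]             # q(theta_k) = Dirichlet(beta[k])

        # Expected log parameters under the current global factors.
        psi_alpha_sum = digamma(sum(alpha))
        e_log_pi = [digamma(a) - psi_alpha_sum for a in alpha]
        e_log_theta = [[digamma(b) - digamma(sum(beta[k])) for b in beta[k]]
                       for k in range(K)]

        # Local factors q(z_i): softmax of the expected log joint, computed stably.
        for i in range(n):
            logp = [e_log_pi[k]
                    - 0.5 * ((xs[i] - m[k]) ** 2 + v[k])
                    + e_log_theta[k][cs[i]]
                    for k in range(K)]
            top = max(logp)
            w = [math.exp(lp - top) for lp in logp]
            total = sum(w)
            r[i] = [wk / total for wk in w]

    return r, alpha, m, v, beta


if __name__ == "__main__":
    # Synthetic data: two separated clusters whose categorical labels differ in frequency.
    rng = random.Random(1)
    xs = [rng.gauss(-3, 1) for _ in range(100)] + [rng.gauss(3, 1) for _ in range(100)]
    cs = ([0 if rng.random() < 0.9 else 1 for _ in range(100)]
          + [1 if rng.random() < 0.9 else 0 for _ in range(100)])
    r, alpha, m, v, beta = cavi_mixed(xs, cs, K=2, M=2)
    print("posterior means of component locations:", sorted(m))
```

Each sweep updates one block of variational factors while holding the others fixed, which is what makes the scheme coordinate ascent. The variances `v` and the Dirichlet parameters `alpha` and `beta` are where the uncertainty quantification comes from, in contrast to a point-estimate EM fit.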
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference
