A simple multithreaded implementation of the EM algorithm for mixture models
Sharon X Lee, Kaleb L Lee, and Geoffrey J McLachlan

TL;DR
This paper presents a straightforward multithreaded implementation of the EM algorithm, enabling faster fitting of complex mixture models like skew normal and skew t-distributions on multicore systems.
Contribution
It introduces a simple, parallelized EM algorithm that can be easily integrated into existing code for efficient estimation of various mixture models.
Findings
Significant reduction in computation time for large datasets.
Applicable to a wide range of mixture models including normal, t-, skew normal, and skew t-distributions.
Easy to implement with minimal code modifications.
Abstract
Finite mixture models have been widely used for the modelling and analysis of data from heterogeneous populations. Maximum likelihood estimation of the parameters is typically carried out via the Expectation-Maximization (EM) algorithm. The complexity of the implementation of the algorithm depends on the parametric distribution that is adopted as the component densities of the mixture model. In the case of the skew normal and skew t-distributions, for example, the E-step would involve complicated expressions that are computationally expensive to evaluate. This can become quite time-consuming for large and/or high-dimensional datasets. In this paper, we develop a multithreaded version of the EM algorithm for the fitting of finite mixture models. Due to the structure of the algorithm for these models, the E- and M-steps can be easily reformulated to be executed in parallel across multiple…
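As an illustration of why the E- and M-steps parallelize so readily, here is a minimal sketch for a univariate normal mixture. It is not the authors' code: the function names (`e_step_chunk`, `em_iteration`), the `n_threads` parameter, and the use of Python's `ThreadPoolExecutor` are all assumptions made for this example. The idea is that each thread computes posterior probabilities and partial sufficient statistics for its own chunk of the data, and the M-step then only needs the sums of those partial results. Skew normal or skew t components would use the same split, with more expensive per-observation expressions inside the chunk function.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def e_step_chunk(x, weights, means, variances):
    """E-step on one data chunk; returns partial sufficient statistics."""
    # log(pi_k * N(x_j | mu_k, sigma_k^2)) for every observation j and component k
    diff = x[:, None] - means[None, :]
    log_p = (np.log(weights)
             - 0.5 * np.log(2.0 * np.pi * variances)
             - 0.5 * diff**2 / variances)
    # Normalise in log space to get posterior probabilities (responsibilities)
    log_norm = np.logaddexp.reduce(log_p, axis=1)
    tau = np.exp(log_p - log_norm[:, None])
    # Partial sums needed by the M-step, plus this chunk's log-likelihood
    return tau.sum(axis=0), tau.T @ x, tau.T @ x**2, log_norm.sum()


def em_iteration(x, weights, means, variances, n_threads=4):
    """One EM iteration with the E-step split across threads (hypothetical helper)."""
    chunks = np.array_split(x, n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        parts = list(pool.map(
            lambda c: e_step_chunk(c, weights, means, variances), chunks))
    # Combine the per-chunk partial sums
    s0 = sum(p[0] for p in parts)
    s1 = sum(p[1] for p in parts)
    s2 = sum(p[2] for p in parts)
    loglik = sum(p[3] for p in parts)  # log-likelihood at the input parameters
    # Closed-form M-step for normal components
    new_weights = s0 / len(x)
    new_means = s1 / s0
    new_variances = s2 / s0 - new_means**2
    return new_weights, new_means, new_variances, loglik


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = np.concatenate([rng.normal(-2, 1, 500), rng.normal(3, 1, 500)])
    w, m, v = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])
    for _ in range(20):
        w, m, v, ll = em_iteration(data, w, m, v)
    print(w, m, v, ll)
```

Because the chunks are combined only through sums, the result does not depend on the number of threads, up to floating-point rounding. In Python the actual speed-up depends on how much of the work runs inside NumPy routines that release the global interpreter lock; a compiled language with native threads would be a more natural fit for the heavier skew-distribution E-steps.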
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Distribution Estimation and Applications · Statistical Methods and Bayesian Inference
