Hard-Clustering with Gaussian Mixture Models
Johannes Blömer, Sascha Brauer, Kathrin Bujna

TL;DR
This paper presents two algorithms for a restricted version of the complete-data maximum likelihood estimation (CMLE) problem for Gaussian mixture models. Unlike the standard CEM algorithm, which may converge to an arbitrarily poor stationary point, they return approximate solutions with provable quality guarantees.
Contribution
The paper proposes two algorithms for a restricted version of the CMLE problem; both compute near-optimal solutions with provable approximation guarantees.
Findings
Algorithms compute solutions within a factor (1+ε) of optimal
Addresses the weak convergence guarantee of standard CEM, which may stop at an arbitrarily poor stationary point
Provides solutions satisfying natural properties of the CMLE problem
Abstract
Training the parameters of statistical models to describe a given data set is a central task in the field of data mining and machine learning. A very popular and powerful way of parameter estimation is the method of maximum likelihood estimation (MLE). Among the most widely used families of statistical models are mixture models, especially, mixtures of Gaussian distributions. A popular hard-clustering variant of the MLE problem is the so-called complete-data maximum likelihood estimation (CMLE) method. The standard approach to solve the CMLE problem is the Classification-Expectation-Maximization (CEM) algorithm. Unfortunately, it is only guaranteed that the algorithm converges to some (possibly arbitrarily poor) stationary point of the objective function. In this paper, we present two algorithms for a restricted version of the CMLE problem. That is, our algorithms approximate…
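For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of a textbook-style CEM loop for Gaussian mixtures, not the algorithms proposed in the paper. It alternates between assigning each point to the component with the highest weighted log-density and re-estimating weights, means, and covariances from those hard assignments, and it reports the complete-data log-likelihood of the final assignment. The function names (`log_gaussian`, `component_scores`, `cem`), the identity-covariance initialization, the `reg` ridge term, and the handling of empty components are assumptions made for illustration.

```python
import numpy as np


def log_gaussian(X, mean, cov):
    """Log-density of N(mean, cov) evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    _, logdet = np.linalg.slogdet(cov)
    # Squared Mahalanobis distance of every row to the mean.
    maha = np.einsum("ni,ni->n", diff, np.linalg.solve(cov, diff.T).T)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def component_scores(X, weights, means, covs):
    """Matrix of log(w_j) + log N(x_n | mean_j, cov_j), shape (n, k)."""
    with np.errstate(divide="ignore"):  # empty components get log(0) = -inf
        log_w = np.log(weights)
    cols = [log_w[j] + log_gaussian(X, means[j], covs[j])
            for j in range(len(weights))]
    return np.stack(cols, axis=1)


def cem(X, init_means, n_iter=100, reg=1e-6):
    """Hard-assignment EM sketch (assumes n_iter >= 1).

    Returns (labels, weights, means, covs, complete-data log-likelihood).
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    means = np.array(init_means, dtype=float)
    k = means.shape[0]
    # Assumption: start with identity covariances and uniform weights.
    covs = np.stack([np.eye(d) for _ in range(k)])
    weights = np.full(k, 1.0 / k)
    labels = None
    for _ in range(n_iter):
        # E- and C-step: assign each point to its best-scoring component.
        new_labels = component_scores(X, weights, means, covs).argmax(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments are stable: a stationary point was reached
        labels = new_labels
        # M-step: maximum likelihood estimates from the hard partition.
        counts = np.bincount(labels, minlength=k)
        weights = counts / n
        for j in range(k):
            if counts[j] == 0:
                continue  # assumption: leave an empty component unchanged
            members = X[labels == j]
            means[j] = members.mean(axis=0)
            diff = members - means[j]
            # Small ridge term keeps the covariance invertible.
            covs[j] = diff.T @ diff / counts[j] + reg * np.eye(d)
    scores = component_scores(X, weights, means, covs)
    loglik = scores[np.arange(n), labels].sum()
    return labels, weights, means, covs, loglik
```

The result depends entirely on `init_means`: a poor initialization can leave the loop at a stationary point with a much lower complete-data log-likelihood than the best partition, which is the weakness of CEM that the abstract highlights.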
Taxonomy
Topics: Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research · Algorithms and Data Compression
