Local Minima Structures in Gaussian Mixture Models
Yudong Chen, Dogyoon Song, Xumei Xi, Yuqian Zhang

TL;DR
This paper analyzes the local minima of the negative log-likelihood landscape of Gaussian Mixture Models (GMMs), showing that all of them share a common structure and giving sharper bounds for the one-dimensional case with three components.
Contribution
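It characterizes the structure of all local minima of the GMM negative log-likelihood under a separation condition on the true components, shows that they relate to the true cluster centers, and shows that the characterization still holds when the number of components is over- or under-specified.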
Findings
All local minima share a common structure related to true cluster centers.
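Each local minimum is a non-overlapping combination of two sub-configurations: a single mean estimate fit to multiple true components, or multiple estimates fit to a single true component.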
Sharper approximation bounds are provided for 1D GMMs with three components.
Abstract
We investigate the landscape of the negative log-likelihood function of Gaussian Mixture Models (GMMs) with a general number of components in the population limit. As the objective function is non-convex, there can be multiple local minima that are not globally optimal, even for well-separated mixture models. Our study reveals that all local minima share a common structure that partially identifies the cluster centers (i.e., means of the Gaussian components) of the true location mixture. Specifically, each local minimum can be represented as a non-overlapping combination of two types of sub-configurations: fitting a single mean estimate to multiple Gaussian components or fitting multiple estimates to a single true component. These results apply to settings where the true mixture components satisfy a certain separation condition, and are valid even when the number of components is over-…
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference · Statistical Methods and Inference
