Local Maxima in the Likelihood of Gaussian Mixture Models: Structural Results and Algorithmic Consequences
Chi Jin, Yuchen Zhang, Sivaraman Balakrishnan, Martin J. Wainwright, Michael Jordan

TL;DR
This paper analyzes the likelihood landscape of Gaussian mixture models with three or more components. It shows that bad local maxima exist and that the EM algorithm with random initialization tends to converge to them, which underscores the importance of careful initialization.
Contribution
It provides fundamental theoretical results on the structure of the likelihood function and the convergence behavior of EM, resolving open questions and clarifying algorithmic limitations.
Findings
Bad local maxima can be arbitrarily worse than global optima.
EM with random initialization converges to bad critical points with high probability.
First-order EM variants avoid strict saddle points almost surely.
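For orientation, the objective behind these findings can be written as below. This is a sketch of the setting named in the abstract (equally weighted, spherical components), not a quotation from the paper: the symbols $\mathcal{L}$, $\mu$, $\mu^*$, and $\phi$ are notation assumed here, and unit variance is assumed for simplicity.

```latex
\mathcal{L}(\mu)
  = \mathbb{E}_{X \sim \mathrm{GMM}(\mu^*)}
    \left[ \log \left( \frac{1}{M} \sum_{i=1}^{M} \phi(X \mid \mu_i, I) \right) \right],
\qquad \mu = (\mu_1, \dots, \mu_M)
```

Here $\phi(\cdot \mid \mu_i, I)$ is the density of a Gaussian with mean $\mu_i$ and identity covariance, and $\mu^*$ collects the true component means. In this notation, a bad local maximum is a local maximizer $\mu'$ with $\mathcal{L}(\mu') < \mathcal{L}(\mu^*)$, and "arbitrarily worse" means the gap $\mathcal{L}(\mu^*) - \mathcal{L}(\mu')$ can exceed any fixed constant for a suitable choice of $\mu^*$.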
Abstract
We provide two fundamental results on the population (infinite-sample) likelihood function of Gaussian mixture models with $M \geq 3$ components. Our first main result shows that the population likelihood function has bad local maxima even in the special case of equally-weighted mixtures of well-separated and spherical Gaussians. We prove that the log-likelihood value of these bad local maxima can be arbitrarily worse than that of any global optimum, thereby resolving an open question of Srebro (2007). Our second main result shows that the EM algorithm (or a first-order variant of it) with random initialization will converge to bad critical points with probability at least $1 - e^{-\Omega(M)}$. We further establish that a first-order variant of EM will not converge to strict saddle points almost surely, indicating that the poor performance of the first-order method can be attributed to…
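The effect described in the abstract can be illustrated with a small, self-contained sketch. It is not the paper's construction or proof: it runs EM on a finite sample rather than on the population likelihood, works in one dimension, and fixes the weights at 1/M and the variance at 1. The true means, the two initializations, the sample size, and all function names are illustrative assumptions.

```python
import math
import random


def log_terms(x, means):
    # Log of each component's term: weight 1/M times a unit-variance Gaussian density.
    m = len(means)
    const = -0.5 * math.log(2 * math.pi) - math.log(m)
    return [const - 0.5 * (x - mu) ** 2 for mu in means]


def logsumexp(vals):
    # Numerically stable log(sum(exp(vals))).
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))


def avg_log_likelihood(data, means):
    # Sample average of the mixture log-density (finite-sample stand-in
    # for the population log-likelihood).
    return sum(logsumexp(log_terms(x, means)) for x in data) / len(data)


def em_step(data, means):
    # One EM update of the means only; weights and variances stay fixed.
    m = len(means)
    num = [0.0] * m
    den = [0.0] * m
    for x in data:
        terms = log_terms(x, means)
        norm = logsumexp(terms)
        for k in range(m):
            resp = math.exp(terms[k] - norm)  # E-step: responsibility of component k
            num[k] += resp * x
            den[k] += resp
    # M-step: responsibility-weighted mean (keep the old mean if a component is empty).
    return [num[k] / den[k] if den[k] > 0 else means[k] for k in range(m)]


def run_em(data, init, iters=50):
    means = list(init)
    for _ in range(iters):
        means = em_step(data, means)
    return means


# Assumed toy setup: three well-separated, equally weighted, unit-variance clusters.
rng = random.Random(0)
true_means = [-4.0, 4.0, 40.0]
data = [rng.gauss(rng.choice(true_means), 1.0) for _ in range(900)]

# Initialization near the truth versus one that puts a single center between
# two clusters and two centers on the remaining cluster.
good_means = run_em(data, [-3.0, 3.0, 38.0])
bad_means = run_em(data, [0.0, 39.0, 41.0])

ll_good = avg_log_likelihood(data, good_means)
ll_bad = avg_log_likelihood(data, bad_means)

print("good init ->", [round(v, 2) for v in good_means], "avg log-lik:", round(ll_good, 3))
print("bad init  ->", [round(v, 2) for v in bad_means], "avg log-lik:", round(ll_bad, 3))
```

In this toy setup the second run stays stuck with one center covering two clusters, and its average log-likelihood is clearly lower than that of the first run. Moving the two merged clusters further apart widens the gap, which is the flavor of the "arbitrarily worse" statement, though the sketch itself proves nothing about the population likelihood.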
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference · Statistical Methods and Inference
