Benefits of over-parameterization with EM
Ji Xu, Daniel Hsu, Arian Maleki

TL;DR
This paper demonstrates that over-parameterization in EM algorithms helps avoid local optima and improves convergence to the global maximum in Gaussian mixture models, supported by theoretical proofs and empirical evidence.
Contribution
It provides the first theoretical and empirical evidence that over-parameterization enhances EM's ability to find global optima in Gaussian mixture models.
Findings
For symmetric two-component Gaussian mixtures, over-parameterization enables EM to find the global maximum from almost any initial point.
Introducing redundant weight parameters improves EM's convergence in symmetric Gaussian mixtures.
Empirical results show similar benefits in other Gaussian mixture scenarios.
Abstract
Expectation Maximization (EM) is among the most popular algorithms for maximum likelihood estimation, but it is generally only guaranteed to find stationary points of the log-likelihood objective. The goal of this article is to present theoretical and empirical evidence that over-parameterization can help EM avoid spurious local optima in the log-likelihood. We consider the problem of estimating the mean vectors of a Gaussian mixture model in a scenario where the mixing weights are known. Our study shows that the global behavior of EM, when one uses an over-parameterized model in which the mixing weights are treated as unknown, is better than that when one uses the (correct) model with the mixing weights fixed to the known values. For symmetric Gaussian mixtures with two components, we prove that introducing the (statistically redundant) weight parameters enables EM to find the…
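The comparison described in the abstract can be tried on a toy problem. The following is a minimal sketch, not the authors' code. It assumes a one-dimensional mixture w·N(θ, 1) + (1 − w)·N(−θ, 1) with unit variance, and the names `sample_mixture`, `log_likelihood`, and `em`, as well as the values θ* = 2, w* = 0.7, and the starting points, are illustrative choices made here. The same EM loop is run twice: once with the weight fixed at its true value, and once with the weight treated as unknown and updated in the M-step.

```python
import numpy as np


def sample_mixture(n, theta_star, w_star, seed=0):
    """Draw n points from w*N(theta, 1) + (1 - w)*N(-theta, 1) (assumed toy model)."""
    rng = np.random.default_rng(seed)
    from_plus = rng.random(n) < w_star
    centers = np.where(from_plus, theta_star, -theta_star)
    return centers + rng.standard_normal(n)


def log_likelihood(x, theta, w):
    """Sample log-likelihood of the two-component symmetric mixture."""
    log_norm = -0.5 * np.log(2.0 * np.pi)
    a = np.log(w) + log_norm - 0.5 * (x - theta) ** 2
    b = np.log1p(-w) + log_norm - 0.5 * (x + theta) ** 2
    return float(np.sum(np.logaddexp(a, b)))


def em(x, theta0, w0, update_weight, n_iter=200):
    """Run EM; if update_weight is False the weight stays at w0 throughout."""
    theta, w = float(theta0), float(w0)
    history = [log_likelihood(x, theta, w)]
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from N(+theta, 1).
        logit = 2.0 * theta * x + np.log(w) - np.log1p(-w)
        gamma = 1.0 / (1.0 + np.exp(-logit))
        # M-step: closed-form maximizers of the expected complete log-likelihood.
        theta = float(np.mean((2.0 * gamma - 1.0) * x))
        if update_weight:
            w = float(np.mean(gamma))
        history.append(log_likelihood(x, theta, w))
    return theta, w, history


if __name__ == "__main__":
    x = sample_mixture(n=5000, theta_star=2.0, w_star=0.7)
    # Start both runs on the "wrong" side of the origin.
    fixed = em(x, theta0=-1.0, w0=0.7, update_weight=False)
    over = em(x, theta0=-1.0, w0=0.5, update_weight=True)
    print("fixed weight:      theta=%.3f w=%.3f loglik=%.1f" % (fixed[0], fixed[1], fixed[2][-1]))
    print("over-parameterized: theta=%.3f w=%.3f loglik=%.1f" % (over[0], over[1], over[2][-1]))
```

When reading the output, note that (θ, w) and (−θ, 1 − w) describe the same distribution, so an over-parameterized fit with a negative θ and a weight near 1 − w* is a label swap rather than a failure. The fixed-weight run has no such freedom, which is why comparing the final log-likelihoods is the more informative check.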
Taxonomy
Topics: Bayesian Methods and Mixture Models · Advanced Statistical Methods and Models · Statistical Methods and Inference
