Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixtures
Mo Zhou, Weihang Xu, Maryam Fazel, Simon S. Du

TL;DR
This paper proves that the gradient EM algorithm converges globally when fitting over-parameterized Gaussian Mixture Models (GMMs) with more than two components, requiring only mild over-parameterization and introducing new analytical tools.
Contribution
The paper provides the first global convergence guarantee for gradient EM on over-parameterized GMMs with more than two components.
Findings
Gradient EM converges globally when the learning model uses n = Ω(m log m) components to fit an m-component ground truth GMM.
Convergence occurs at a polynomial rate with polynomially many samples.
The analysis introduces new tools based on Hermite polynomials and tensor decomposition.
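As background for the Hermite polynomial tooling mentioned above, the following is a minimal sketch of the property that makes these polynomials a natural basis for Gaussian analysis: the probabilists' Hermite polynomials He_k are orthogonal under the standard normal measure, with E[He_j(Z) He_k(Z)] = k! when j = k and 0 otherwise. This is not the paper's analysis, which is not described on this page; the function name `hermite_gram` and the quadrature size are illustrative assumptions.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

def hermite_gram(max_deg, quad_points=20):
    """Gram matrix E[He_j(Z) He_k(Z)] for Z ~ N(0, 1), j, k = 0..max_deg.

    Hypothetical helper for illustration. Uses Gauss-HermiteE quadrature,
    which is exact here as long as 2 * max_deg <= 2 * quad_points - 1.
    """
    x, w = hermegauss(quad_points)       # nodes/weights for weight exp(-x^2 / 2)
    w = w / np.sqrt(2.0 * np.pi)         # rescale so the weights sum to 1
    basis = np.eye(max_deg + 1)          # row k selects the polynomial He_k
    H = np.stack([hermeval(x, basis[k]) for k in range(max_deg + 1)])
    return (H * w) @ H.T                 # entry (j, k) = E[He_j(Z) He_k(Z)]

if __name__ == "__main__":
    G = hermite_gram(4)
    print(np.round(G, 6))                # approximately diagonal, with k! on the diagonal
```

The diagonal entries grow as k!, so dividing He_k by sqrt(k!) gives an orthonormal basis under the Gaussian measure.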
Abstract
Learning Gaussian Mixture Models (GMMs) is a fundamental problem in machine learning, with the Expectation-Maximization (EM) algorithm and its popular variant gradient EM being arguably the most widely used algorithms in practice. In the exact-parameterized setting, where both the ground truth GMM and the learning model have the same number of components m, a vast line of work has aimed to establish rigorous recovery guarantees for EM. However, global convergence has only been proven for the case of m = 2, and EM is known to fail to recover the ground truth when m ≥ 3. In this paper, we consider the over-parameterized setting, where the learning model uses n > m components to fit an m-component ground truth GMM. In contrast to the exact-parameterized case, we provide a rigorous global convergence guarantee for gradient EM. Specifically, for any well separated GMMs…
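To make the over-parameterized setup concrete, here is a minimal sketch of gradient EM in which a learner with n components fits data drawn from an m-component mixture. It rests on simplifying assumptions that this page does not confirm for the paper: identity covariances, uniform mixing weights held fixed, and only the means updated. The function names, the toy ground truth (m = 3, n = 6, two dimensions), the step size, and the sample size are all illustrative choices, not values from the paper.

```python
import numpy as np

def pairwise_sq_dists(X, mus):
    # Squared Euclidean distance from every sample to every mean, shape (N, n).
    return ((X[:, None, :] - mus[None, :, :]) ** 2).sum(axis=2)

def responsibilities(X, mus):
    # E-step: posterior probability of each component per sample
    # (assumes identity covariance and uniform mixing weights).
    logits = -0.5 * pairwise_sq_dists(X, mus)
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    r = np.exp(logits)
    return r / r.sum(axis=1, keepdims=True)

def gradient_em_step(X, mus, lr):
    # One gradient ascent step on the EM surrogate Q instead of a full M-step.
    # dQ/dmu_j = (1/N) * sum_i r_ij * (x_i - mu_j)
    r = responsibilities(X, mus)
    grad = (r.T @ X - r.sum(axis=0)[:, None] * mus) / X.shape[0]
    return mus + lr * grad

def avg_log_likelihood(X, mus):
    # Average log-likelihood under the learner's uniform-weight mixture.
    d = X.shape[1]
    logp = (-0.5 * pairwise_sq_dists(X, mus)
            - 0.5 * d * np.log(2.0 * np.pi) - np.log(len(mus)))
    top = logp.max(axis=1, keepdims=True)
    return float((top[:, 0] + np.log(np.exp(logp - top).sum(axis=1))).mean())

def run_demo(seed=0, steps=300, lr=1.0):
    rng = np.random.default_rng(seed)
    # Hypothetical well separated ground truth with m = 3 components.
    true_mus = np.array([[6.0, 0.0], [-6.0, 0.0], [0.0, 6.0]])
    labels = rng.integers(0, len(true_mus), size=2000)
    X = true_mus[labels] + rng.standard_normal((2000, 2))
    mus = rng.standard_normal((6, 2))   # learner uses n = 6 > m components
    ll_start = avg_log_likelihood(X, mus)
    for _ in range(steps):
        mus = gradient_em_step(X, mus, lr)
    return ll_start, avg_log_likelihood(X, mus), mus

if __name__ == "__main__":
    ll_start, ll_end, mus = run_demo()
    print(f"avg log-likelihood: {ll_start:.3f} -> {ll_end:.3f}")
    print(np.round(mus, 2))
```

With a step size of at most 1, each update moves every mean part of the way toward its full M-step value, so the likelihood does not decrease from one step to the next. This toy run does not by itself demonstrate the global convergence guarantee the paper proves.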
Taxonomy
Topics: Machine Learning and Algorithms · Tensor decomposition and applications · Gaussian Processes and Bayesian Inference
