Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixtures

Mo Zhou; Weihang Xu; Maryam Fazel; Simon S. Du

arXiv:2506.06584 · cs.LG · June 10, 2025

TL;DR

This paper proves that the gradient EM algorithm converges globally for over-parameterized Gaussian Mixture Models with more than two components, under mild over-parameterization, using novel analytical tools.

Contribution

It provides the first global convergence guarantee for gradient EM on over-parameterized GMMs with more than two components.

Findings

01

Gradient EM converges globally when the learning model uses n = Ω(m log m) components to fit an m-component ground truth.

02

Convergence occurs at a polynomial rate and requires only polynomially many samples.

03

Introduces new analytical tools using Hermite polynomials and tensor decomposition.

Abstract

Learning Gaussian Mixture Models (GMMs) is a fundamental problem in machine learning, with the Expectation-Maximization (EM) algorithm and its popular variant gradient EM being arguably the most widely used algorithms in practice. In the exact-parameterized setting, where both the ground truth GMM and the learning model have the same number of components $m$, a vast line of work has aimed to establish rigorous recovery guarantees for EM. However, global convergence has only been proven for the case of $m = 2$, and EM is known to fail to recover the ground truth when $m \geq 3$. In this paper, we consider the over-parameterized setting, where the learning model uses $n > m$ components to fit an $m$-component ground truth GMM. In contrast to the exact-parameterized case, we provide a rigorous global convergence guarantee for gradient EM. Specifically, for any well separated GMMs…
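To make the setting in the abstract concrete, here is a minimal sketch of gradient EM fitting $n > m$ components to an $m$-component mixture. It is an illustration under simplifying assumptions that are not taken from the paper: identity covariances, fixed uniform mixing weights, means as the only learned parameters, a finite sample in place of the population objective, and hypothetical values for the step size, dimensions, and true means. It does not reproduce the paper's analysis or its exact algorithmic setup.

```python
import numpy as np

def responsibilities(X, means):
    """E-step: posterior weight of each component for each point.
    Assumes identity covariance and uniform mixing weights (illustrative)."""
    sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)  # (N, n)
    logits = -0.5 * sq_dist
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def gradient_em_step(X, means, step_size):
    """One gradient EM step: instead of the full M-step, take a single
    gradient ascent step on the means using the current responsibilities."""
    R = responsibilities(X, means)  # (N, n)
    # grad_j = (1/N) * sum_i R[i, j] * (x_i - mu_j)
    grad = (R.T @ X - R.sum(axis=0)[:, None] * means) / X.shape[0]
    return means + step_size * grad

def avg_log_likelihood(X, means):
    """Average log-likelihood of X under the uniform-weight, unit-covariance GMM."""
    d = X.shape[1]
    sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    logits = -0.5 * sq_dist
    mx = logits.max(axis=1, keepdims=True)
    lse = mx[:, 0] + np.log(np.exp(logits - mx).sum(axis=1))
    return float(np.mean(lse) - np.log(means.shape[0]) - 0.5 * d * np.log(2 * np.pi))

def demo(seed=0):
    rng = np.random.default_rng(seed)
    m, n, d, N = 3, 6, 2, 3000  # hypothetical sizes: n > m (over-parameterized)
    true_means = np.array([[-6.0, 0.0], [6.0, 0.0], [0.0, 8.0]])  # hypothetical
    labels = rng.integers(m, size=N)
    X = true_means[labels] + rng.standard_normal((N, d))
    means = rng.standard_normal((n, d))  # random initialization
    before = avg_log_likelihood(X, means)
    for _ in range(500):
        means = gradient_em_step(X, means, step_size=0.5)
    after = avg_log_likelihood(X, means)
    return before, after, means

if __name__ == "__main__":
    before, after, means = demo()
    print("avg log-likelihood before:", before)
    print("avg log-likelihood after: ", after)
    print("learned means:\n", means)
```

In this sketch several of the six learned means may end up sharing one true cluster; that redundancy is what over-parameterization allows. Whether every true mean is covered from a given initialization is the question the paper's guarantee addresses, and this toy run should not be read as evidence for or against it.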

Taxonomy

Topics: Machine Learning and Algorithms · Tensor decomposition and applications · Gaussian Processes and Bayesian Inference