Toward Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixture Models
Weihang Xu, Maryam Fazel, Simon S. Du

TL;DR
This paper proves the first global convergence guarantee for gradient EM algorithms applied to over-parameterized Gaussian Mixture Models with more than two components, revealing sublinear convergence behavior.
Contribution
The paper introduces a novel likelihood-based convergence analysis framework and establishes the first global convergence result for over-parameterized GMMs with an arbitrary number of components.
Findings
Gradient EM converges globally with a sublinear rate of O(1/√t)
First proof of global convergence for GMMs with more than 2 components
Identifies bad local regions in over-parameterized GMMs that can trap gradient EM for an exponential number of steps
Abstract
We study the gradient Expectation-Maximization (EM) algorithm for Gaussian Mixture Models (GMM) in the over-parameterized setting, where a general GMM with n > 1 components learns from data that are generated by a single ground truth Gaussian distribution. While results for the special case of 2-Gaussian mixtures are well-known, a general global convergence analysis for arbitrary n remains unresolved and faces several new technical barriers since the convergence becomes sub-linear and non-monotonic. To address these challenges, we construct a novel likelihood-based convergence analysis framework and rigorously prove that gradient EM converges globally with a sublinear rate O(1/√t). This is the first global convergence result for Gaussian mixtures with more than 2 components. The sublinear convergence rate is due to the algorithmic nature of learning over-parameterized GMM…
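To make the setting in the abstract concrete, here is a minimal sketch of gradient EM on the component means. It is not the authors' code, and it rests on several assumptions that the abstract does not state: equal mixing weights, identity covariances, a ground truth of N(0, I), and a finite-sample average standing in for the population expectation. The function names, step size `eta`, and step count are hypothetical choices for illustration.

```python
import numpy as np


def responsibilities(x, mu):
    """Posterior weight psi[j, i] of component i for sample x[j].

    Assumes equal mixing weights and identity covariances.
    """
    # Squared distances between every sample and every mean, shape (m, n).
    sq_dist = ((x[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    logits = -0.5 * sq_dist
    # Subtract the row-wise max for numerical stability before exponentiating.
    logits = logits - logits.max(axis=1, keepdims=True)
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)


def gradient_em_step(x, mu, eta):
    """One gradient EM step on the means, using a sample average."""
    psi = responsibilities(x, mu)  # shape (m, n)
    # Row i holds the sample mean of psi_i(x) * (x - mu_i).
    grad = psi.T @ x / len(x) - psi.mean(axis=0)[:, None] * mu
    return mu + eta * grad


def run_gradient_em(x, mu0, eta=0.2, steps=500):
    """Iterate gradient EM and record the largest mean norm at each step."""
    mu = np.array(mu0, dtype=float)
    max_norms = [np.linalg.norm(mu, axis=1).max()]
    for _ in range(steps):
        mu = gradient_em_step(x, mu, eta)
        max_norms.append(np.linalg.norm(mu, axis=1).max())
    return mu, np.array(max_norms)
```

With samples drawn from a standard Gaussian and three initial means, `run_gradient_em` returns the final means and the trajectory of the largest mean norm. Plotting that trajectory against the step count on a log-log scale is one way to inspect how slowly the means contract. On a finite sample the iterates approach the sample optimum rather than exactly the origin.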
Taxonomy
Topics: Bayesian Methods and Mixture Models · Algorithms and Data Compression
