An Alternative to EM for Gaussian Mixture Models: Batch and Stochastic Riemannian Optimization
Reshad Hosseini, Suvrit Sra

TL;DR
This paper casts Gaussian Mixture Model (GMM) parameter estimation as a Riemannian optimization problem over positive definite matrices, and develops batch and stochastic gradient algorithms that outperform EM in the authors' experiments.
Contribution
It develops new Riemannian gradient algorithms for GMM estimation, improving over EM, with a first non-asymptotic convergence analysis for Riemannian stochastic gradient methods.
Findings
Out-of-the-box Riemannian formulation fails and performs much worse than EM
Refined Riemannian approach significantly outperforms EM in experiments
First global convergence analysis for Riemannian stochastic gradient methods
Abstract
We consider maximum likelihood estimation for Gaussian Mixture Models (GMMs). This task is almost invariably solved (in theory and practice) via the Expectation Maximization (EM) algorithm. EM owes its success to various factors, of which its ability to fulfill positive definiteness constraints in closed form is of key importance. We propose an alternative to EM by appealing to the rich Riemannian geometry of positive definite matrices, using which we cast GMM parameter estimation as a Riemannian optimization problem. Surprisingly, such an out-of-the-box Riemannian formulation completely fails and proves much inferior to EM. This motivates us to take a closer look at the problem geometry, and derive a better formulation that is much more amenable to Riemannian optimization. We then develop (Riemannian) batch and stochastic gradient algorithms that outperform EM, often substantially.…
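To make the idea of optimizing over positive definite matrices concrete, here is a minimal sketch of Riemannian gradient descent for the covariance of a single zero-mean Gaussian. It is not the authors' algorithm and not their reformulated GMM objective: the function names (`sym_fun`, `spd_exp_map`, `fit_covariance_riemannian`), the step size, and the choice of the affine-invariant metric are all assumptions made for illustration. The point is only that each update moves along the manifold, so the iterate stays positive definite without any projection.

```python
import numpy as np


def sym_fun(M, fun):
    """Apply a scalar function to a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * fun(w)) @ V.T  # V @ diag(fun(w)) @ V.T


def spd_exp_map(S, xi):
    """Exponential map at S on the SPD manifold (affine-invariant metric, assumed).

    Exp_S(xi) = S^{1/2} expm(S^{-1/2} xi S^{-1/2}) S^{1/2}
    """
    S_half = sym_fun(S, np.sqrt)
    S_inv_half = sym_fun(S, lambda w: 1.0 / np.sqrt(w))
    inner = S_inv_half @ xi @ S_inv_half
    inner = 0.5 * (inner + inner.T)  # remove round-off asymmetry
    out = S_half @ sym_fun(inner, np.exp) @ S_half
    return 0.5 * (out + out.T)


def fit_covariance_riemannian(C, step=0.5, iters=200):
    """Minimize f(S) = 0.5 * (logdet(S) + trace(S^{-1} C)) over SPD matrices.

    f is the average negative log-likelihood (up to a constant) of zero-mean
    Gaussian data with sample covariance C. Its Euclidean gradient is
    0.5 * (S^{-1} - S^{-1} C S^{-1}); under the affine-invariant metric the
    Riemannian gradient is S @ egrad @ S = 0.5 * (S - C).
    """
    S = np.eye(C.shape[0])  # illustrative starting point
    for _ in range(iters):
        rgrad = 0.5 * (S - C)
        S = spd_exp_map(S, -step * rgrad)  # step along the geodesic
    return S


# Usage: recover the sample covariance of synthetic data.
rng = np.random.default_rng(0)
X = rng.multivariate_normal([0.0, 0.0], [[2.0, 0.5], [0.5, 1.0]], size=500)
C = X.T @ X / len(X)
S_hat = fit_covariance_riemannian(C)
print(np.round(S_hat, 4))
print(np.round(C, 4))
```

For this single-Gaussian problem the minimizer is the sample covariance itself, which makes the sketch easy to check. According to the abstract, applying a plain manifold formulation of this kind directly to GMMs performs much worse than EM, which is what motivates the authors' reformulation.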
Taxonomy
Topics: Bayesian Methods and Mixture Models · Gaussian Processes and Bayesian Inference · Statistical Mechanics and Entropy
