Gaussian Mixture Model with Rare Events
Xuetong Li, Jing Zhou, Hansheng Wang

TL;DR
This paper analyzes the slow convergence of the EM algorithm in Gaussian Mixture Models with rare events and proposes a Mixed EM algorithm utilizing partially labeled data to improve convergence.
Contribution
The paper provides a theoretical explanation for EM's slow convergence with rare events and introduces a novel Mixed EM algorithm that enhances convergence using partially labeled data.
Findings
Spectral radius close to 1 explains slow convergence
Mixed EM significantly improves convergence rate
Method performs well on real-world traffic sign data
Abstract
We study here a Gaussian Mixture Model (GMM) with rare events data. In this case, the commonly used Expectation-Maximization (EM) algorithm exhibits an extremely slow numerical convergence rate. To theoretically understand this phenomenon, we formulate the numerical convergence problem of the EM algorithm with rare events data as a problem about a contraction operator. Theoretical analysis reveals that the spectral radius of the contraction operator in this case could be arbitrarily close to 1 asymptotically. This theoretical finding explains the empirically slow numerical convergence of the EM algorithm with rare events data. To overcome this challenge, a Mixed EM (MEM) algorithm is developed, which utilizes the information provided by partially labeled data. Compared with the standard EM algorithm, the key feature of the MEM algorithm is that it additionally requires labeled data. We…
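To make the idea in the abstract concrete, here is a minimal sketch of EM for a one-dimensional, two-component GMM in which one component is rare, with an optional set of labels. The exact MEM update rules are not given in this summary, so the labeled variant below is an assumption: a generic semi-supervised EM in which labeled points get fixed responsibilities (1 for the rare class, 0 for the common class) and unlabeled points get the usual posterior. The function names `make_rare_data`, `mask_labels`, and `em_two_component`, the 1-D setting, the initialization, and all parameter values are hypothetical choices for illustration, not taken from the paper.

```python
import numpy as np


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def make_rare_data(n_common=4950, n_rare=50, seed=0):
    """Simulated data (assumption): common class N(0, 1), rare class N(3, 1)."""
    rng = np.random.default_rng(seed)
    x = np.concatenate([rng.normal(0.0, 1.0, n_common), rng.normal(3.0, 1.0, n_rare)])
    labels = np.concatenate([np.zeros(n_common, dtype=int), np.ones(n_rare, dtype=int)])
    return x, labels


def mask_labels(labels, labeled_fraction, seed=0):
    """Keep a random fraction of labels; mark the rest as -1 (unlabeled)."""
    rng = np.random.default_rng(seed)
    keep = rng.random(labels.size) < labeled_fraction
    return np.where(keep, labels, -1)


def em_two_component(x, labels=None, n_iter=2000, tol=1e-8):
    """EM for a 1-D two-component GMM; labels use 1 = rare, 0 = common, -1 = unlabeled.

    Returns (parameter dict, number of iterations used).
    """
    x = np.asarray(x, dtype=float)
    labels = -np.ones(x.size, dtype=int) if labels is None else np.asarray(labels)
    # Crude initialization (assumption): quartiles for means, pooled variance.
    theta = np.array([0.5, np.quantile(x, 0.25), np.quantile(x, 0.75), np.var(x), np.var(x)])
    for it in range(1, n_iter + 1):
        pi, mu0, mu1, var0, var1 = theta
        # E-step: posterior probability of the rare component for each point.
        p1 = pi * normal_pdf(x, mu1, var1)
        p0 = (1.0 - pi) * normal_pdf(x, mu0, var0)
        r = p1 / (p0 + p1 + 1e-300)
        # Labeled points: responsibility is fixed by the label.
        r = np.where(labels == 1, 1.0, r)
        r = np.where(labels == 0, 0.0, r)
        # M-step: weighted maximum-likelihood updates.
        n1, n0 = r.sum(), (1.0 - r).sum()
        new_mu1 = (r * x).sum() / n1
        new_mu0 = ((1.0 - r) * x).sum() / n0
        new_var1 = max((r * (x - new_mu1) ** 2).sum() / n1, 1e-6)  # variance floor
        new_var0 = max(((1.0 - r) * (x - new_mu0) ** 2).sum() / n0, 1e-6)
        new_theta = np.array([n1 / x.size, new_mu0, new_mu1, new_var0, new_var1])
        change = np.max(np.abs(new_theta - theta))
        theta = new_theta
        if change < tol:
            break
    names = ["pi", "mu0", "mu1", "var0", "var1"]
    return {k: float(v) for k, v in zip(names, theta)}, it


if __name__ == "__main__":
    x, labels = make_rare_data()
    _, it_plain = em_two_component(x)
    _, it_mixed = em_two_component(x, labels=mask_labels(labels, 0.2))
    print("iterations without labels:", it_plain)
    print("iterations with 20% labels:", it_mixed)
```

Running the script prints the number of iterations each variant needs to reach the tolerance, which gives a rough way to compare numerical convergence on simulated rare-event data. The counts depend on the seed, the separation between components, and the initialization, so they should be read as an illustration rather than a reproduction of the paper's results.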
Taxonomy
Topics: Bayesian Methods and Mixture Models
