Parameter Estimation in Gaussian Mixture Models with Malicious Noise, without Balanced Mixing Coefficients
Jing Xu, Jakub Marecek

TL;DR
This paper introduces a robust algorithm for estimating the means of an imbalanced two-component Gaussian mixture contaminated with noise of an arbitrary distribution. It provides upper bounds on the sample complexity and reports lower estimation error than vanilla EM in practice.
Contribution
It presents the first sample-complexity bounds for imbalanced two-component Gaussian mixtures corrupted by adversarial noise, together with a robust algorithm whose estimation error is lower than that of vanilla EM in practice.
Findings
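The algorithm estimates the means of both Gaussians robustly under adversarial noise.
The upper bounds on sample complexity are parametrised by the dimension, the ratio of the mixing coefficients, the separation of the two Gaussians, and a condition number of the covariance matrix.
In practice, the algorithm achieves lower estimation error than the vanilla EM algorithm.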
Abstract
We consider the problem of estimating the means of the two Gaussians in a 2-Gaussian mixture that is not balanced and is corrupted by noise of an arbitrary distribution. We present a robust algorithm to estimate the parameters, together with upper bounds on the number of samples required for the estimate to be correct. The bounds are parametrised by the dimension, the ratio of the mixing coefficients, a measure of the separation of the two Gaussians (related to the Mahalanobis distance), and a condition number of the covariance matrix. In theory, this is the first sample-complexity result for imbalanced mixtures corrupted by adversarial noise. In practice, our algorithm outperforms the vanilla Expectation-Maximisation (EM) algorithm in terms of estimation error.
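To make the setting concrete, the following is a minimal sketch of the problem and of the vanilla EM baseline named in the abstract. It is not the paper's robust algorithm. Everything specific here is an assumption made for illustration: identity covariances, outliers drawn uniformly from a box (a real adversary may choose far worse noise), and the hypothetical helper names `sample_contaminated_mixture`, `em_means`, and `mean_error`.

```python
import numpy as np


def sample_contaminated_mixture(n, mu1, mu2, w1, eps, rng):
    """Draw n points: a (1 - eps) fraction from an imbalanced 2-Gaussian
    mixture (weight w1 on the first component, identity covariance) and
    an eps fraction of outliers."""
    d = mu1.shape[0]
    n_noise = int(round(eps * n))
    n_clean = n - n_noise
    from_first = rng.random(n_clean) < w1
    centers = np.where(from_first[:, None], mu1, mu2)
    clean = centers + rng.standard_normal((n_clean, d))
    # Assumption: uniform outliers in a box, chosen only for illustration.
    noise = rng.uniform(-20.0, 20.0, size=(n_noise, d))
    return np.vstack([clean, noise])


def em_means(x, init1, init2, iters=100):
    """Vanilla EM for two components with identity covariances.
    Updates the mixing weight and both means; has no outlier handling."""
    m1, m2, w = init1.astype(float), init2.astype(float), 0.5
    for _ in range(iters):
        # E-step: responsibility of component 1, computed in log space.
        log1 = np.log(w) - 0.5 * np.sum((x - m1) ** 2, axis=1)
        log2 = np.log(1.0 - w) - 0.5 * np.sum((x - m2) ** 2, axis=1)
        r = 1.0 / (1.0 + np.exp(np.clip(log2 - log1, -700.0, 700.0)))
        # M-step: weighted averages; the weight is kept away from 0 and 1.
        w = float(np.clip(r.mean(), 1e-6, 1.0 - 1e-6))
        m1 = (r[:, None] * x).sum(axis=0) / (r.sum() + 1e-12)
        m2 = ((1.0 - r)[:, None] * x).sum(axis=0) / ((1.0 - r).sum() + 1e-12)
    return m1, m2, w


def mean_error(est1, est2, mu1, mu2):
    """Largest Euclidean error over the two means, minimised over the
    two possible labelings of the components."""
    direct = max(np.linalg.norm(est1 - mu1), np.linalg.norm(est2 - mu2))
    swapped = max(np.linalg.norm(est1 - mu2), np.linalg.norm(est2 - mu1))
    return min(direct, swapped)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu1, mu2 = np.array([0.0, 0.0]), np.array([6.0, 0.0])
    for eps in (0.0, 0.1):
        x = sample_contaminated_mixture(4000, mu1, mu2, 0.8, eps, rng)
        m1, m2, _ = em_means(x, mu1 + 0.5, mu2 - 0.5)
        print(f"eps={eps}: EM mean error = {mean_error(m1, m2, mu1, mu2):.3f}")
```

Running the script prints the EM estimation error with and without contamination, which gives a baseline against which a robust estimator could be compared. The error metric is minimised over label permutations because the two components of a mixture are identifiable only up to relabeling.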
