Tight bounds for learning a mixture of two gaussians
Moritz Hardt, Eric Price

TL;DR
This paper establishes optimal sample complexity bounds for learning the parameters of a two-component Gaussian mixture in any dimension, using a moment-based estimator with efficient algorithms and identifying special cases with reduced sample needs.
Contribution
It provides the first tight bounds on the sample complexity of learning the parameters of a mixture of two Gaussians, extends the upper bound to higher dimensions through a novel dimensionality reduction technique, and extends the lower bound to mixtures of more than two Gaussians.
Findings
Optimal sample complexity of Θ(σ^{12}) in 1D
Extension of the upper bound to higher dimensions with a logarithmic loss in the dimension
Reduced sample complexity in special cases like well-separated means
Abstract
We consider the problem of identifying the parameters of an unknown mixture of two arbitrary d-dimensional gaussians from a sequence of independent random samples. Our main results are upper and lower bounds giving a computationally efficient moment-based estimator with an optimal convergence rate, thus resolving a problem introduced by Pearson (1894). Denoting by σ^2 the variance of the unknown mixture, we prove that Θ(σ^{12}) samples are necessary and sufficient to estimate each parameter up to constant additive error when d = 1. Our upper bound extends to arbitrary dimension d > 1 up to a (provably necessary) logarithmic loss in d using a novel, yet simple, dimensionality reduction technique. We further identify several interesting special cases where the sample complexity is notably smaller than our optimal worst-case bound. For instance, if the means of the…
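To make "moment-based" concrete, here is a minimal Python sketch, assuming a one-dimensional mixture. It is not the paper's estimator: it only computes the empirical raw moments that a method-of-moments procedure would take as input and compares them with the exact moments of the mixture. The function names, the parameter values, and the choice of six moments are illustrative assumptions.

```python
import numpy as np

def sample_mixture(n, w, mu1, s1, mu2, s2, rng):
    """Draw n samples from w*N(mu1, s1^2) + (1-w)*N(mu2, s2^2)."""
    from_first = rng.random(n) < w          # component label per sample
    return np.where(from_first,
                    rng.normal(mu1, s1, n),
                    rng.normal(mu2, s2, n))

def gaussian_raw_moments(mu, s, k_max):
    """Exact raw moments E[X^k], k = 0..k_max, of N(mu, s^2).

    Uses the recurrence m_k = mu*m_{k-1} + (k-1)*s^2*m_{k-2}.
    """
    m = [1.0, mu]
    for k in range(2, k_max + 1):
        m.append(mu * m[k - 1] + (k - 1) * s**2 * m[k - 2])
    return np.array(m[:k_max + 1])

def mixture_raw_moments(w, mu1, s1, mu2, s2, k_max):
    """Exact raw moments of the two-component mixture (weighted average)."""
    return (w * gaussian_raw_moments(mu1, s1, k_max)
            + (1 - w) * gaussian_raw_moments(mu2, s2, k_max))

def empirical_raw_moments(x, k_max):
    """Sample averages of x^k for k = 0..k_max."""
    return np.array([np.mean(x**k) for k in range(k_max + 1)])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = (0.5, -1.0, 1.0, 1.0, 0.5)     # hypothetical w, mu1, s1, mu2, s2
    exact = mixture_raw_moments(*params, k_max=6)
    for n in (1_000, 100_000):
        x = sample_mixture(n, *params, rng)
        err = np.abs(empirical_raw_moments(x, 6) - exact)
        print(n, np.round(err, 3))
```

Running the script prints the absolute error of each empirical moment at two sample sizes. A moment-based estimator would then solve for the mixture parameters from such moments; that step is omitted here.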
Taxonomy
Topics: Machine Learning and Algorithms · Bayesian Methods and Mixture Models · Domain Adaptation and Few-Shot Learning
