Learning general Gaussian mixtures with efficient score matching
Sitan Chen, Vasilis Kontonis, Kulin Shah

TL;DR
This paper gives an algorithm for learning Gaussian mixture models without separation assumptions. It is built on diffusion models and score matching, and it runs in time polynomial in the number of samples it draws.
Contribution
It presents the first diffusion model-based approach with theoretical guarantees for learning Gaussian mixtures in polynomial time.
Findings
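Achieves sample complexity $d^{\mathrm{poly}(k/\varepsilon)}$, which is polynomial in the dimension $d$ when the number of components $k$ and the accuracy $\varepsilon$ are fixed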
Runs in sample-polynomial time with total variation guarantees
First to apply diffusion models to unsupervised learning with theoretical bounds
Abstract
We study the problem of learning mixtures of $k$ Gaussians in $d$ dimensions. We make no separation assumptions on the underlying mixture components: we only require that the covariance matrices have bounded condition number and that the means and covariances lie in a ball of bounded radius. We give an algorithm that draws $d^{\mathrm{poly}(k/\varepsilon)}$ samples from the target mixture, runs in sample-polynomial time, and constructs a sampler whose output distribution is $\varepsilon$-far from the unknown mixture in total variation. Prior works for this problem either (i) required exponential runtime in the dimension $d$, (ii) placed strong assumptions on the instance (e.g., spherical covariances or clusterability), or (iii) had doubly exponential dependence on the number of components $k$. Our approach departs from commonly used techniques for this problem like the method of…
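For readers unfamiliar with score matching, the object it estimates is the score function, i.e. the gradient of the log density. The sketch below is not the paper's algorithm; it only evaluates the score of a known Gaussian mixture in closed form, to show what a learned score is supposed to approximate. The function name `gmm_score` and its arguments are illustrative assumptions, and NumPy is assumed to be available.

```python
import numpy as np

def gmm_score(x, weights, means, covs):
    """Score (gradient of the log density) of a Gaussian mixture at point x.

    Illustrative helper, not taken from the paper: weights is a length-k
    sequence summing to 1, means holds k vectors of length d, and covs
    holds k positive definite (d, d) matrices.
    """
    x = np.asarray(x, dtype=float)
    d = x.shape[0]
    log_terms = []  # log(w_i) + log N(x; mu_i, cov_i) for each component
    grads = []      # score of each individual Gaussian component at x
    for w, mu, cov in zip(weights, means, covs):
        diff = x - np.asarray(mu, dtype=float)
        solved = np.linalg.solve(cov, diff)  # cov^{-1} (x - mu)
        _, logdet = np.linalg.slogdet(cov)
        log_pdf = -0.5 * (diff @ solved + logdet + d * np.log(2.0 * np.pi))
        log_terms.append(np.log(w) + log_pdf)
        grads.append(-solved)
    log_terms = np.array(log_terms)
    # Posterior component probabilities via a numerically stable softmax.
    resp = np.exp(log_terms - log_terms.max())
    resp /= resp.sum()
    # The mixture score is the responsibility-weighted average of the
    # component scores.
    return resp @ np.array(grads)

# Usage: score of an equal-weight two-component mixture in 2 dimensions.
example = gmm_score(
    np.array([0.5, 0.0]),
    [0.5, 0.5],
    [np.array([3.0, 0.0]), np.array([-3.0, 0.0])],
    [np.eye(2), np.eye(2)],
)
```

In a diffusion-model setting the score is needed for noised versions of the data rather than for the data itself. Adding Gaussian noise to a Gaussian mixture yields another Gaussian mixture, so the same closed form applies with adjusted means and covariances.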
Taxonomy
Topics: Bayesian Methods and Mixture Models
Methods: Diffusion
