Learning general Gaussian mixtures with efficient score matching

Sitan Chen; Vasilis Kontonis; Kulin Shah

arXiv:2404.18893 · cs.DS · November 20, 2024 · 1 citation

TL;DR

This paper introduces an algorithm for learning mixtures of $k$ Gaussians in $d$ dimensions without separation assumptions. It uses diffusion models and score matching, draws $d^{\mathrm{poly}(k/\varepsilon)}$ samples, and runs in sample-polynomial time.

Contribution

It presents the first diffusion-model-based approach with theoretical guarantees for learning Gaussian mixtures in sample-polynomial time.

Findings

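01 Draws $d^{\mathrm{poly}(k/\varepsilon)}$ samples, which is polynomial in the dimension $d$ when $k$ and $\varepsilon$ are fixed

02 Runs in sample-polynomial time and constructs a sampler whose output is within $\varepsilon$ of the unknown mixture in total variation

03 Described as the first diffusion-model-based method with theoretical bounds for an unsupervised learning task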

Abstract

We study the problem of learning mixtures of $k$ Gaussians in $d$ dimensions. We make no separation assumptions on the underlying mixture components: we only require that the covariance matrices have bounded condition number and that the means and covariances lie in a ball of bounded radius. We give an algorithm that draws $d^{\mathrm{poly}(k/\varepsilon)}$ samples from the target mixture, runs in sample-polynomial time, and constructs a sampler whose output distribution is $\varepsilon$-far from the unknown mixture in total variation. Prior works for this problem either (i) required exponential runtime in the dimension $d$, (ii) placed strong assumptions on the instance (e.g., spherical covariances or clusterability), or (iii) had doubly exponential dependence on the number of components $k$. Our approach departs from commonly used techniques for this problem like the method of…
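For readers unfamiliar with the term in the title, score matching estimates the score function, the gradient of the log-density. For a Gaussian mixture the exact score has a closed form: a responsibility-weighted average of the per-component scores. The sketch below illustrates that formula in one dimension only, and checks it against a finite-difference derivative. It is not the paper's algorithm. All function names and the two-component example mixture are hypothetical choices made for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) at the scalar x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_log_density(x, weights, means, variances):
    """log p(x) for p = sum_i weights[i] * N(means[i], variances[i])."""
    return math.log(sum(w * gaussian_pdf(x, m, v)
                        for w, m, v in zip(weights, means, variances)))


def mixture_score(x, weights, means, variances):
    """Score d/dx log p(x), written as a responsibility-weighted
    average of the per-component scores (mean - x) / var."""
    joint = [w * gaussian_pdf(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return sum((j / total) * (m - x) / v
               for j, m, v in zip(joint, means, variances))


# Hypothetical mixture with overlapping (non-separated) components.
WEIGHTS = [0.3, 0.7]
MEANS = [-0.5, 0.5]
VARIANCES = [1.0, 2.0]


def finite_difference_score(x, h=1e-5):
    """Central-difference approximation of the score, used only as a check."""
    up = mixture_log_density(x + h, WEIGHTS, MEANS, VARIANCES)
    down = mixture_log_density(x - h, WEIGHTS, MEANS, VARIANCES)
    return (up - down) / (2.0 * h)
```

Calling `mixture_score(x, WEIGHTS, MEANS, VARIANCES)` and `finite_difference_score(x)` at the same point should agree up to discretization error. In $d$ dimensions the same formula holds with $(\mu_i - x)/\sigma_i^2$ replaced by $\Sigma_i^{-1}(\mu_i - x)$.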

Videos

Learning General Gaussian Mixtures With Efficient Score Matching · YouTube

Taxonomy

Topics: Bayesian Methods and Mixture Models

Methods: Diffusion