Clustering Mixture Models in Almost-Linear Time via List-Decodable Mean Estimation
Ilias Diakonikolas, Daniel M. Kane, Daniel Kongsgaard, Jerry Li, Kevin Tian

TL;DR
This paper introduces almost-linear time algorithms for list-decodable mean estimation and for clustering mixtures of distributions, improving on the runtime of previous methods while retaining nearly-optimal statistical guarantees.
Contribution
It presents the first almost-linear time algorithms for clustering mixtures of distributions, bypassing $k$-PCA, and introduces a new robust mean estimation method based on matrix multiplicative weights.
Findings
Achieved nearly-optimal statistical guarantees with $O(n^{1+\epsilon_0} d)$ runtime.
First runtime improvement for clustering mixtures of distributions in nearly two decades.
Developed a new robust mean estimation algorithm in the $\alpha \to 1$ regime.
Abstract
We study the problem of list-decodable mean estimation, where an adversary can corrupt a majority of the dataset. Specifically, we are given a set $T$ of $n$ points in $\mathbb{R}^d$ and a parameter $0 < \alpha < \frac{1}{2}$ such that an $\alpha$-fraction of the points in $T$ are i.i.d. samples from a well-behaved distribution $\mathcal{D}$ and the remaining $(1-\alpha)$-fraction are arbitrary. The goal is to output a small list of vectors, at least one of which is close to the mean of $\mathcal{D}$. We develop new algorithms for list-decodable mean estimation, achieving nearly-optimal statistical guarantees, with running time $O(n^{1+\epsilon_0} d)$, for any fixed $\epsilon_0 > 0$. All prior algorithms for this problem had additional polynomial factors in $\frac{1}{\alpha}$. We leverage this result, together with additional techniques, to obtain the first almost-linear time algorithms for…
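To make the problem setup concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm. It assumes a deliberately simple adversary that plants well-separated decoy clusters, and it uses a naive greedy grouping to produce the candidate list. All function names, the decoy layout, and the parameter values are hypothetical choices for illustration. The sketch shows why a single estimate such as the empirical mean fails when inliers are a minority, and what success means for a list: the minimum distance from any candidate to the true mean is small.

```python
import math
import random


def make_instance(n, d, alpha, seed=0):
    """Return (points, true_mean). An alpha-fraction of points are N(true_mean, I)
    inliers; the rest are decoy clusters from a simple, hypothetical adversary."""
    rng = random.Random(seed)
    true_mean = [1.0] * d
    n_in = int(alpha * n)
    points = [[m + rng.gauss(0, 1) for m in true_mean] for _ in range(n_in)]
    # Assumed adversary: about 1/alpha - 1 decoy clusters, each the size of the
    # inlier cluster, so no cluster can be singled out as the real one.
    n_decoys = max(1, round(1 / alpha) - 1)
    for i in range(n - n_in):
        centre = 10.0 * (1 + i % n_decoys)  # far from true_mean in every coordinate
        points.append([centre + rng.gauss(0, 1) for _ in range(d)])
    rng.shuffle(points)
    return points, true_mean


def mean(points):
    d = len(points[0])
    return [sum(p[j] for p in points) / len(points) for j in range(d)]


def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def greedy_candidates(points, radius):
    """Naive baseline: repeatedly take a seed point, group every remaining point
    within `radius` of it, and emit the group mean as one candidate."""
    remaining = list(points)
    candidates = []
    while remaining:
        seed_point = remaining[0]
        group = [p for p in remaining if dist(p, seed_point) <= radius]
        remaining = [p for p in remaining if dist(p, seed_point) > radius]
        candidates.append(mean(group))
    return candidates


def list_error(candidates, target):
    """List-decoding success measure: distance from the best candidate to the target."""
    return min(dist(c, target) for c in candidates)


if __name__ == "__main__":
    pts, mu = make_instance(n=400, d=5, alpha=0.25, seed=0)
    print("empirical mean error:", dist(mean(pts), mu))
    print("list error:", list_error(greedy_candidates(pts, radius=10.0), mu))
```

The greedy grouping only works here because the decoys are well separated and it compares each seed against every remaining point. A real adversary can place outliers to defeat it, which is the setting the algorithms in the paper are designed for.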
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Random Matrices and Applications
