Pair Correlation Factor and the Sample Complexity of Gaussian Mixtures
Farzad Aryan

TL;DR
This paper introduces the Pair Correlation Factor (PCF), a new geometric measure that better predicts the sample complexity of learning Gaussian Mixture Models than previous gap-based metrics.
Contribution
The paper proposes the PCF as a novel geometric property influencing GMM sample complexity and provides an improved algorithm with tighter bounds in the spherical case.
Findings
The PCF predicts sample complexity more accurately than the minimum gap
New algorithm with improved sample complexity bounds for uniform spherical GMMs
More than the usual number of samples can be needed, depending on the PCF
Abstract
We study the problem of learning Gaussian Mixture Models (GMMs) and ask: which structural properties govern their sample complexity? Prior work has largely tied this complexity to the minimum pairwise separation between components, but we demonstrate this view is incomplete. We introduce the Pair Correlation Factor (PCF), a geometric quantity capturing the clustering of component means. Compared with the minimum gap, the PCF more accurately dictates the difficulty of parameter recovery. In the uniform spherical case, we give an algorithm with improved sample complexity bounds, showing when more than the usual number of samples is necessary.
