On the Role of Channel Capacity in Learning Gaussian Mixture Models
Elad Romanov, Tamir Bendory, Or Ordentlich

TL;DR
This paper investigates the sample complexity of learning Gaussian mixture models, revealing a sharp noise threshold linked to channel capacity that determines when the problem becomes significantly harder.
Contribution
It characterizes the exact noise level threshold where GMM learning transitions from easy to hard, connecting it to the capacity of the AWGN channel.
Findings
Identifies the noise threshold at $\frac{\log k}{d} = \frac12\log\left(1+\frac{1}{\sigma^2}\right)$, the capacity of the AWGN channel.
Shows the difficulty of GMM learning is connected to decoding error probability.
Suggests the statistical difficulty depends on channel coding error rates, not just minimum distance.
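The link between the centers and a channel code can be illustrated numerically. The sketch below is not taken from the paper: it treats the centers as codewords, adds spherical Gaussian noise, and estimates by Monte Carlo how often a nearest-center decoder picks the wrong center. The sphere radius of sqrt(d), the function names, and the values of k, d, and the noise levels are all assumptions chosen for illustration.

```python
import numpy as np

def sample_centers(k, d, rng):
    """Draw k centers uniformly on a sphere of radius sqrt(d) (assumed normalization)."""
    g = rng.standard_normal((k, d))
    return np.sqrt(d) * g / np.linalg.norm(g, axis=1, keepdims=True)

def decoding_error_rate(centers, sigma2, n_trials, rng):
    """Monte Carlo estimate of the nearest-center decoding error probability."""
    k, d = centers.shape
    labels = rng.integers(0, k, size=n_trials)           # transmitted "codeword" index
    noise = np.sqrt(sigma2) * rng.standard_normal((n_trials, d))
    y = centers[labels] + noise                          # noisy observations
    # Minimizing ||y - c||^2 over c is the same as maximizing y.c - ||c||^2 / 2.
    scores = y @ centers.T - 0.5 * np.sum(centers**2, axis=1)
    decoded = np.argmax(scores, axis=1)
    return float(np.mean(decoded != labels))

rng = np.random.default_rng(0)
centers = sample_centers(k=64, d=20, rng=rng)            # illustrative sizes only
low_noise_error = decoding_error_rate(centers, 0.05, 2000, rng)
high_noise_error = decoding_error_rate(centers, 20.0, 2000, rng)
print(low_noise_error, high_noise_error)
```

With these illustrative settings the two noise levels sit well on either side of the capacity threshold, so the estimated error rate should be small in the first case and large in the second. At such a small dimension this only hints at the asymptotic behavior the paper analyzes.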
Abstract
This paper studies the sample complexity of learning the $k$ unknown centers of a balanced Gaussian mixture model (GMM) in $\mathbb{R}^d$ with spherical covariance matrix $\sigma^2 I$. In particular, we are interested in the following question: what is the maximal noise level $\sigma^2$, for which the sample complexity is essentially the same as when estimating the centers from labeled measurements? To that end, we restrict attention to a Bayesian formulation of the problem, where the centers are uniformly distributed on the sphere $\sqrt{d}\,\mathcal{S}^{d-1}$. Our main results characterize the exact noise threshold $\sigma^2$ below which the GMM learning problem, in the large system limit $d,k\to\infty$, is as easy as learning from labeled observations, and above which it is substantially harder. The threshold occurs at $\frac{\log k}{d} = \frac12\log\left( 1+\frac{1}{\sigma^2}…
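The threshold relation in the abstract can be solved for the noise level. The sketch below assumes the truncated formula closes immediately after the $\frac{1}{\sigma^2}$ term, giving the standard AWGN capacity at unit signal power per dimension, and that logarithms are natural. The function names are hypothetical and not from the paper.

```python
import math

def awgn_capacity(sigma2):
    """AWGN capacity at unit signal power, in nats per dimension (assumed form)."""
    return 0.5 * math.log1p(1.0 / sigma2)

def threshold_noise_level(k, d):
    """Noise level sigma^2 at which log(k)/d equals the AWGN capacity.

    Solving log(k)/d = 0.5 * log(1 + 1/sigma^2) gives
    sigma^2 = 1 / (exp(2 * log(k) / d) - 1).
    """
    rate = math.log(k) / d
    return 1.0 / math.expm1(2.0 * rate)

sigma2_star = threshold_noise_level(k=64, d=20)  # illustrative values only
print(sigma2_star, awgn_capacity(sigma2_star), math.log(64) / 20)
```

Under these assumptions, noise levels below `sigma2_star` correspond to a rate $\frac{\log k}{d}$ below capacity, which is the regime the abstract describes as being as easy as learning from labeled observations.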
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Statistical Methods and Bayesian Inference
