The interplay of signal-to-noise ratio and variance misspecification in Gaussian mixtures
Vladimir Serov, Amnon Balanov, and Tamir Bendory

TL;DR
This paper analyzes how variance misspecification affects estimation and clustering in Gaussian mixture models, revealing phase transitions and the impact of SNR on bias and cluster recovery.
Contribution
It characterizes the effects of variance misspecification on Gaussian mixture models, deriving phase diagrams and asymptotic behaviors under different SNR regimes.
Findings
Under correct specification, MLE recovers true means regardless of SNR.
Variance misspecification causes mean displacement (under-smoothing) or cluster collapse (over-smoothing).
Low SNR leads to near-random clustering, with bias from misspecification and hard assignments.
Abstract
We study estimation and clustering in Gaussian mixture models under variance misspecification. Observations are generated with a true variance, while the component means are estimated using a likelihood with a possibly different assumed variance, yielding a family of mismatched likelihood functions parameterized by the ratio of the two variances. We show that the interplay between this ratio and the signal-to-noise ratio (SNR) induces a sharp phase diagram. Under correct specification (ratio equal to one), maximum likelihood recovers the true means, independently of the SNR. However, once the model is misspecified, two different regimes emerge. Under under-smoothing (), the estimated Gaussian means are displaced from the truth, and in low SNR this discrepancy grows as the SNR decreases: for every fixed , the squared error scales as . Under over-smoothing (), the fitted…
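The mismatched-likelihood setup can be explored numerically with a minimal sketch. It is not the authors' method or code: it assumes the simplest special case, a one-dimensional, equal-weight, symmetric two-component mixture with means at plus and minus theta, and fits theta by EM while holding the variance fixed at an assumed value. The names `sample_symmetric_mixture`, `em_mean`, and `assumed_var` are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_symmetric_mixture(theta, sigma, n, seed=0):
    """Draw n points from 0.5*N(+theta, sigma^2) + 0.5*N(-theta, sigma^2).

    Illustrative assumption: a 1-D, equal-weight, symmetric mixture.
    """
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) * theta + rng.gauss(0.0, sigma)
            for _ in range(n)]


def em_mean(xs, assumed_var, theta0=1.0, iters=100):
    """EM for the mean of the symmetric mixture, variance fixed at assumed_var.

    assumed_var may differ from the variance that generated xs; that gap is
    the misspecification. For this model the E-step and M-step combine into
    one update: theta <- mean(x * tanh(theta * x / assumed_var)).
    """
    theta = theta0
    for _ in range(iters):
        theta = sum(x * math.tanh(theta * x / assumed_var) for x in xs) / len(xs)
    return theta


if __name__ == "__main__":
    true_theta, true_sigma = 1.0, 1.0
    xs = sample_symmetric_mixture(true_theta, true_sigma, 20000, seed=0)
    # Sweep the assumed variance around the true variance (1.0 here).
    for assumed_var in (0.25, 1.0, 4.0):
        print(assumed_var, em_mean(xs, assumed_var))
```

Sweeping `assumed_var` around the true variance shows how the fitted mean responds to the mismatch in this toy case. One fact can be checked by hand: because tanh(u) is at most u for nonnegative u, each update is bounded by theta times the empirical second moment divided by `assumed_var`, so whenever `assumed_var` exceeds the empirical second moment the iteration contracts to zero and the two fitted components merge.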
