Model Selection and Parameter Estimation of Multi-dimensional Gaussian Mixture Model
Xinyu Liu, Hai Zhang

TL;DR
This paper introduces a method with optimal sample complexity for learning multi-dimensional Gaussian Mixture Models (GMMs). It combines spectral gap estimation, gradient-based parameter refinement, and PCA for high-dimensional data, and is reported to outperform traditional EM algorithms.
Contribution
The paper presents a minimax optimal spectral gap estimator, a data-driven initialization for gradient-based parameter estimation, and an efficient PCA-based approach for high-dimensional GMM learning.
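The PCA step can be illustrated with a generic projection onto the leading principal directions. This is a minimal sketch of standard PCA dimension reduction, not the paper's procedure: the function name `pca_project`, its signature, and the choice of keeping `k` directions are assumptions made for illustration. The idea is that when component means span a low-dimensional subspace, projecting onto the top principal directions keeps the separation between means while discarding most noise-only coordinates.

```python
import numpy as np


def pca_project(x, k):
    """Project rows of x onto the top-k principal directions (hypothetical helper).

    x : array of shape (n, d), one sample per row.
    k : number of principal directions to keep (an assumed input).
    Returns the projected data (n, k), the basis (k, d), and the sample mean (d,).
    """
    mean = x.mean(axis=0)
    centered = x - mean
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:k]
    return centered @ basis.T, basis, mean
```

A low-dimensional estimator can then be run on the projected samples, and estimated means can be mapped back to the original space as `z @ basis + mean`.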
Findings
Sample complexity matches the lower bound, confirming minimax optimality.
Proposed methods outperform EM in accuracy and speed.
Mean estimation achieves the parametric convergence rate O_p(n^{-1/2}).
Abstract
In this paper, we study the problem of learning multi-dimensional Gaussian Mixture Models (GMMs), with a specific focus on model order selection and efficient mixing distribution estimation. We first establish an information-theoretic lower bound on the critical sample complexity required for reliable model selection. More specifically, we show that distinguishing a -component mixture from a simpler model necessitates a sample size scaling of . We then propose a thresholding-based estimation algorithm that evaluates the spectral gap of an empirical covariance matrix constructed from random Fourier measurement vectors. This parameter-free estimator operates with an efficient time complexity of , scaling linearly with the sample size. We demonstrate that the sample complexity of our method matches the established lower bound, confirming its…
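The idea of reading the model order off a spectral gap can be sketched as follows. This is an illustrative toy version under stated assumptions, not the estimator from the paper: it assumes a known isotropic covariance `sigma**2 * I`, draws Gaussian random frequencies, and uses a fixed, hand-picked `threshold` (the abstract describes a parameter-free estimator, which this sketch is not). All names (`estimate_num_components`, `num_freqs`, `freq_scale`, `threshold`) are hypothetical. Each sample is mapped to a Fourier measurement vector with entries exp(i w_a · x); the empirical second-moment matrix of these vectors, after undoing the Gaussian damping, is close to a matrix whose rank equals the number of components, so counting eigenvalues above the threshold gives an order estimate.

```python
import numpy as np


def estimate_num_components(x, sigma, num_freqs=12, freq_scale=0.5,
                            threshold=0.3, seed=0):
    """Count mixture components from a spectral gap (illustrative sketch).

    Assumes every component has the known covariance sigma**2 * I.
    Returns the estimated order and the eigenvalues in decreasing order.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    # Random Fourier frequencies w_1..w_m (assumed Gaussian sampling scheme).
    w = rng.normal(scale=freq_scale, size=(num_freqs, d))
    # Measurement vectors phi(x_i)[a] = exp(i * w_a . x_i); shape (n, m).
    phi = np.exp(1j * (x @ w.T))
    # Empirical second-moment matrix (1/n) * sum_i phi(x_i) phi(x_i)^H.
    # Entry (a, b) is the empirical characteristic function at w_a - w_b.
    m_hat = phi.T @ phi.conj() / n
    # Undo the Gaussian damping exp(-sigma^2 * |w_a - w_b|^2 / 2).
    diff = w[:, None, :] - w[None, :, :]
    m_hat = m_hat * np.exp(0.5 * sigma**2 * np.sum(diff**2, axis=-1))
    # The population matrix is sum_k pi_k v_k v_k^H, so its rank is the order.
    eigvals = np.linalg.eigvalsh(m_hat)[::-1]
    return int(np.sum(eigvals > threshold)), eigvals
```

The cost is dominated by forming `phi` and the matrix product, both linear in the number of samples for a fixed number of frequencies. The fixed threshold only works when the components are well separated relative to the sampling noise; it would need to shrink with the sample size in a more careful version.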
Taxonomy
Topics: Bayesian Methods and Mixture Models · Machine Learning and Algorithms · Gaussian Processes and Bayesian Inference
