On the Influence of Enforcing Model Identifiability on Learning dynamics of Gaussian Mixture Models
Pascal Mattia Esser, Frank Nielsen

TL;DR
This paper introduces a reparameterization technique that enforces identifiability in singular models like GMMs, leading to improved learning dynamics, faster convergence, and better understanding of model behavior.
Contribution
The authors propose a general reparameterization method to extract regular submodels from singular models, enhancing training stability and interpretability.
Findings
Faster convergence of gradient descent and EM algorithms for GMMs.
Improved manifold shape around singularities.
Potential applicability to deep neural networks.
Abstract
A common way to learn and analyze statistical models is to consider operations in the model parameter space. But what happens if we optimize in the parameter space and there is no one-to-one mapping between the parameter space and the underlying statistical model space? Such cases frequently occur for hierarchical models which include statistical mixtures or stochastic neural networks, and these models are said to be singular. Singular models reveal several important and well-studied problems in machine learning like the decrease in convergence speed of learning trajectories due to attractor behaviors. In this work, we propose a relative reparameterization technique of the parameter space, which yields a general method for extracting regular submodels from singular models. Our method enforces model identifiability during training and we study the learning dynamics for gradient descent…
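The missing one-to-one mapping mentioned in the abstract can be made concrete with a small sketch. It assumes a two-component, one-dimensional Gaussian mixture with unit variances. The helper names (`normal_pdf`, `gmm_pdf`, `max_gap`) and all parameter values are illustrative and not taken from the paper. The code only shows non-identifiability itself, not the authors' reparameterization.

```python
import math

def normal_pdf(x, mu, sigma=1.0):
    """Density of a univariate Gaussian N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def gmm_pdf(x, weight, mu1, mu2):
    """Two-component mixture: weight * N(mu1, 1) + (1 - weight) * N(mu2, 1)."""
    return weight * normal_pdf(x, mu1) + (1.0 - weight) * normal_pdf(x, mu2)

def max_gap(params_a, params_b, grid):
    """Largest pointwise difference between two mixture densities on a grid."""
    return max(abs(gmm_pdf(x, *params_a) - gmm_pdf(x, *params_b)) for x in grid)

# Evaluation grid on [-5, 5] (an arbitrary choice for illustration).
grid = [-5.0 + 0.05 * i for i in range(201)]

# Label swapping: exchanging the two components leaves the density unchanged.
swap_gap = max_gap((0.3, -1.0, 2.0), (0.7, 2.0, -1.0), grid)

# Coinciding means: the mixing weight no longer affects the density,
# so a whole line of parameter values maps to one distribution.
collapse_gap = max_gap((0.1, 0.5, 0.5), (0.9, 0.5, 0.5), grid)

# Control: with distinct means, changing the weight does change the density.
control_gap = max_gap((0.1, -1.0, 2.0), (0.9, -1.0, 2.0), grid)

print("swap gap:", swap_gap)
print("collapse gap:", collapse_gap)
print("control gap:", control_gap)
```

The first two gaps are zero up to floating-point rounding, while the control gap is clearly positive. Parameter sets like the collapsed one, where several parameter values describe the same distribution, are the kind of point at which such a model is called singular.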
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Neural Networks and Applications · Bayesian Methods and Mixture Models
