Technical report: Training Mixture Density Networks with full covariance matrices
Jakob Kruse

TL;DR
This technical report presents a formulation of Mixture Density Networks with full covariance matrices, enabling more flexible modeling of joint distributions in conditional probability tasks.
Contribution
The report introduces an MDN variant with unrestricted covariance matrices, which addresses a limitation of standard MDNs in modeling dependencies between output variables.
Findings
Demonstrates implementation of MDNs with full covariance matrices
Provides documentation for this approach for community use
Highlights potential applications in complex conditional modeling
Abstract
Mixture Density Networks are a tried and tested tool for modelling conditional probability distributions. As such, they constitute a great baseline for novel approaches to this problem. In the standard formulation, an MDN takes some input and outputs parameters for a Gaussian mixture model with restrictions on the mixture components' covariance. Since covariance between random variables is a central issue in the conditional modeling problems we were investigating, I derived and implemented an MDN formulation with unrestricted covariances. It is likely that this has been done before, but I could not find any resources online. For this reason, I have documented my approach in the form of this technical report, in hopes that it may be useful to others facing a similar situation.
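To make the idea of unrestricted covariances concrete, here is a minimal sketch of the negative log-likelihood for a single target under a Gaussian mixture with full covariance matrices. It is not taken from the report. It assumes one common parameterization: each component's precision matrix is written as U^T U, where U is upper triangular and its diagonal is the exponential of an unconstrained network output, which keeps the matrix positive definite. The function name `mdn_full_cov_nll` and the argument layout are hypothetical, chosen for illustration only.

```python
import math


def mdn_full_cov_nll(logits, means, raw_diags, off_diags, y):
    """Negative log-likelihood of target y under a full-covariance Gaussian mixture.

    Assumed (hypothetical) layout of the network outputs for K components
    and a d-dimensional target:
      logits    : K unnormalized mixture weights
      means     : K lists of length d
      raw_diags : K lists of length d, log of the diagonal of U
      off_diags : K lists of length d*(d-1)/2, strictly upper entries of U,
                  row by row
    The precision matrix of component k is U_k^T U_k.
    """
    d = len(y)

    # Log-softmax normalizer for the mixture weights (stable log-sum-exp).
    top = max(logits)
    log_norm = top + math.log(sum(math.exp(l - top) for l in logits))

    log_terms = []
    for k in range(len(logits)):
        # Assemble the upper-triangular factor U with a positive diagonal.
        U = [[0.0] * d for _ in range(d)]
        idx = 0
        for i in range(d):
            U[i][i] = math.exp(raw_diags[k][i])
            for j in range(i + 1, d):
                U[i][j] = off_diags[k][idx]
                idx += 1

        # z = U (y - mu); the Mahalanobis term is then ||z||^2.
        diff = [y[i] - means[k][i] for i in range(d)]
        z = [sum(U[i][j] * diff[j] for j in range(i, d)) for i in range(d)]

        # 0.5 * log det(precision) = sum of log diag(U) = sum of raw_diags.
        half_log_det = sum(raw_diags[k])
        log_gauss = (-0.5 * d * math.log(2.0 * math.pi)
                     + half_log_det
                     - 0.5 * sum(v * v for v in z))
        log_terms.append(logits[k] - log_norm + log_gauss)

    # Log-sum-exp over components gives the mixture log-density.
    top = max(log_terms)
    return -(top + math.log(sum(math.exp(t - top) for t in log_terms)))
```

Setting all entries of `off_diags` to zero recovers a mixture with diagonal covariances, which is one way to check such an implementation against a restricted MDN.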
Taxonomy
Topics: Bayesian Methods and Mixture Models · Gaussian Processes and Bayesian Inference · Bayesian Modeling and Causal Inference
