Linear pretraining in recurrent mixture density networks
Hubert Normandin-Taillon, Frédéric Godin, Chun Wang

TL;DR
This paper introduces a pretraining method for recurrent mixture density networks (RMDNs) that stabilizes training, makes it more robust to NaN values, and lets the network outperform the GARCH model, its linear counterpart.
Contribution
It proposes a linear pretraining approach for RMDNs and a slight modification to the RMDN-GARCH architecture, leading to more robust training and better performance.
Findings
Pretraining reduces NaN occurrences during training.
The method improves RMDN performance over GARCH models.
Architecture modification enhances model stability.
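For reference, the GARCH baseline mentioned in the findings is usually written, in its common GARCH(1,1) form, as the conditional variance recursion below. The symbols are the standard textbook notation rather than notation taken from this page, and the paper may use different lag orders.

```latex
\[
\varepsilon_t = \sigma_t z_t, \qquad z_t \sim \mathcal{N}(0, 1),
\]
\[
\sigma_t^2 = \omega + \alpha\, \varepsilon_{t-1}^2 + \beta\, \sigma_{t-1}^2,
\qquad \omega > 0,\ \alpha \ge 0,\ \beta \ge 0 .
\]
```

The variance update is linear in $\varepsilon_{t-1}^2$ and $\sigma_{t-1}^2$, which is the sense in which GARCH can act as the linear counterpart of a recurrent network that adds non-linear hidden nodes.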
Abstract
We present a method for pretraining a recurrent mixture density network (RMDN). We also propose a slight modification to the architecture of the RMDN-GARCH proposed by Nikolaev et al. [2012]. The pretraining method helps the RMDN avoid bad local minima during training and improves its robustness to the persistent NaN problem, as defined by Guillaumes [2017], which is often encountered with mixture density networks. This problem consists of frequently obtaining "Not a number" (NaN) values during training. The proposed pretraining method resolves these issues by training the linear nodes in the hidden layer of the RMDN first, and only then including updates to the non-linear nodes. This approach improves the performance of the RMDN and ensures that it surpasses that of the GARCH model, which is the RMDN's linear counterpart.
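The two-phase schedule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single Gaussian component instead of a full mixture, one tanh hidden node, finite-difference gradients, a variance floor of 1e-8, and simulated data. All names (`nll`, `train`, `LINEAR_ONLY`, `ALL`) are hypothetical.

```python
import math
import numpy as np

# Parameter layout (an assumption of this sketch):
# [omega, alpha, beta] feed the linear node, [v, a, b, c] feed one tanh node.
LINEAR_ONLY = np.array([True, True, True, False, False, False, False])
ALL = np.ones(7, dtype=bool)

def nll(theta, eps):
    """Average Gaussian negative log-likelihood of the returns eps."""
    omega, alpha, beta, v, a, b, c = theta
    var = float(np.var(eps))  # initial conditional variance
    total = 0.0
    for e in eps:
        total += 0.5 * (math.log(2 * math.pi * var) + e * e / var)
        linear = omega + alpha * e * e + beta * var          # GARCH(1,1) part
        nonlinear = v * math.tanh(a * e * e + b * var + c)   # extra hidden node
        var = max(linear + nonlinear, 1e-8)  # floor keeps the log defined
    return total / len(eps)

def numeric_grad(f, theta, h=1e-5):
    """Central finite-difference gradient (slow, but dependency-free)."""
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        d = np.zeros_like(theta)
        d[i] = h
        grad[i] = (f(theta + d) - f(theta - d)) / (2 * h)
    return grad

def train(theta, eps, mask, steps, lr=0.05):
    """Gradient descent that only updates the parameters selected by mask."""
    theta = theta.copy()
    loss = nll(theta, eps)
    for _ in range(steps):
        grad = numeric_grad(lambda p: nll(p, eps), theta)
        step = lr
        while step > 1e-8:  # backtrack until the loss is finite and lower
            cand = theta.copy()
            cand[mask] -= step * grad[mask]
            cand_loss = nll(cand, eps)
            if math.isfinite(cand_loss) and cand_loss < loss:
                theta, loss = cand, cand_loss
                break
            step *= 0.5
    return theta, loss

# Simulated GARCH(1,1) returns, used only to exercise the sketch.
rng = np.random.default_rng(0)
eps, var = [], 1.0
for _ in range(300):
    e = math.sqrt(var) * rng.standard_normal()
    eps.append(e)
    var = 0.1 + 0.1 * e * e + 0.8 * var
eps = np.array(eps)

# v = 0 means the network starts out as an exact GARCH(1,1) model.
theta0 = np.array([0.1, 0.05, 0.8, 0.0, 0.1, 0.1, 0.0])
loss0 = nll(theta0, eps)
theta1, loss1 = train(theta0, eps, LINEAR_ONLY, steps=20)  # phase 1: linear only
theta2, loss2 = train(theta1, eps, ALL, steps=20)          # phase 2: all nodes
```

Starting the non-linear output weight at zero and masking its updates keeps phase 1 inside the GARCH family, so phase 2 begins from a fit that is already at least as good as the starting point. The backtracking check is a simplification chosen for this sketch so that a step producing a non-finite loss is never accepted.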
Taxonomy
Topics: Machine Learning in Bioinformatics · Neural Networks and Applications
