A Random-Matrix Criterion for Initializing Gated Recurrent Neural Networks
Tommaso Fioratti, Riccardo Marcaccioli, Francesco Casola

TL;DR
This paper derives a simple criterion to identify the critical weight variance in gated RNNs, linking it to optimal performance and providing a new design principle for initialization.
Contribution
It introduces a novel criterion for initializing gated RNNs at the critical point, enhancing stability and performance in reservoir computing.
Findings
The estimated critical weight variance closely matches the gain at which the gated-RNN reservoir reaches peak performance.
The criterion effectively predicts the transition between ordered and chaotic dynamics.
It offers a practical guideline for initialization in recurrent neural networks.
Abstract
Proper weight initialization prior to training has historically been one of the key factors that helped kick off the deep learning revolution. Initialization is even more crucial in "reservoir computing", where the weights of a readout layer are learned linearly while the reservoir weights are fixed and largely determine the richness, stability and memory of the resulting dynamics. In the infinite-width limit it has been shown that meaningful initializations are those sitting at an effective critical point of the randomly initialized model. The phase transition is controlled by the weight variance and separates an ordered phase from a chaotic one where information progressively degrades. Here we derive a simple criterion to estimate the critical weight variance for a broad class of recurrent architectures, and we show that it closely tracks the gain at which a gated-RNN reservoir achieves…
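To make the ordered/chaotic distinction in the abstract concrete, here is a minimal numerical sketch. It is not the paper's criterion and it does not use a gated architecture: it assumes a plain, bias-free tanh RNN with Gaussian weights of standard deviation gain / sqrt(n), and estimates how fast two nearby hidden states separate. The function name `perturbation_growth` and all parameter values are hypothetical choices for illustration. For this ungated toy model, the infinite-width transition is known to sit near gain 1; a gated reservoir would need the paper's own criterion.

```python
import numpy as np

def perturbation_growth(gain, n=200, steps=200, burn_in=100, eps=1e-6, seed=0):
    """Average log growth rate per step of a tiny perturbation in a random tanh RNN.

    Negative values suggest the ordered phase (perturbations shrink),
    positive values suggest the chaotic phase (perturbations grow).
    Assumption: vanilla update h <- tanh(W h), no gates, no input, no bias.
    """
    rng = np.random.default_rng(seed)
    # Weight variance is gain**2 / n, so `gain` plays the role of the control parameter.
    W = rng.normal(0.0, gain / np.sqrt(n), size=(n, n))

    # Let the state settle onto its typical trajectory before measuring.
    h = rng.normal(size=n)
    for _ in range(burn_in):
        h = np.tanh(W @ h)

    # Start a second trajectory at distance eps from the first.
    d = rng.normal(size=n)
    h_pert = h + d * (eps / np.linalg.norm(d))

    log_growth = 0.0
    for _ in range(steps):
        h = np.tanh(W @ h)
        h_pert = np.tanh(W @ h_pert)
        diff = h_pert - h
        dist = np.linalg.norm(diff)
        log_growth += np.log(dist / eps)
        # Rescale the perturbation back to size eps so it stays in the linear regime.
        h_pert = h + diff * (eps / dist)

    return log_growth / steps

if __name__ == "__main__":
    for g in (0.5, 0.9, 1.1, 1.5, 3.0):
        print(f"gain={g:.1f}  growth rate per step={perturbation_growth(g):+.3f}")
```

Sweeping `gain` and looking for the sign change of the returned rate gives a rough, finite-size estimate of where this toy network leaves the ordered phase. The estimate is noisy near the transition, so averaging over several seeds is advisable.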
