A Random-Matrix Criterion for Initializing Gated Recurrent Neural Networks

Tommaso Fioratti; Riccardo Marcaccioli; Francesco Casola

arXiv:2605.10650 · cs.LG · May 12, 2026

TL;DR

This paper derives a simple criterion for estimating the critical weight variance in gated RNNs, shows that it closely tracks the gain at which a gated-RNN reservoir performs best, and proposes it as a design principle for initialization.

Contribution

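The paper introduces a criterion for initializing gated RNNs at the critical point between ordered and chaotic dynamics. The target setting is reservoir computing, where the reservoir weights stay fixed and largely determine the richness, stability and memory of the dynamics.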

Findings

1. The estimated critical gain $g_{c}$ closely tracks the gain at which a gated-RNN reservoir reaches peak performance.

2. The criterion predicts the transition between ordered and chaotic dynamics.

3. The criterion offers a practical initialization guideline for recurrent neural networks.

Abstract

Proper weight initialization prior to training has historically been one of the key factors that helped kick off the deep learning revolution. Initialization is even more crucial in "reservoir computing", where the weights of a readout layer are learned linearly while the reservoir weights are fixed and largely determine the richness, stability and memory of the resulting dynamics. In the infinite-width limit it has been shown that meaningful initializations are those sitting at an effective critical point of the randomly initialized model. The phase transition is controlled by the weight variance $g^{2}$ and separates an ordered phase from a chaotic one where information progressively degrades. Here we derive a simple criterion to estimate the critical $g_{c}$ for a broad class of recurrent architectures and we show that it closely tracks the gain at which a gated-RNN reservoir achieves…
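
The paper's criterion for gated architectures is not reproduced in this summary. As a rough point of reference only, the sketch below illustrates the well-known ungated analogue: for a vanilla tanh network with i.i.d. Gaussian recurrent weights of variance $g^{2}/N$ and no input, the circular law puts the spectral radius of the weight matrix near $g$, so the zero state loses stability around $g \approx 1$. The setup (Gaussian weights, tanh nonlinearity, zero bias and input) and the function names `spectral_radius` and `final_state_rms` are assumptions made for illustration, not taken from the paper.

```python
import numpy as np


def random_weights(g, n, rng):
    """Sample an n x n matrix with i.i.d. N(0, g**2 / n) entries (assumed setup)."""
    return rng.normal(0.0, g / np.sqrt(n), size=(n, n))


def spectral_radius(g, n=400, seed=0):
    """Largest eigenvalue modulus of the random weight matrix.

    By the circular law this is close to g for large n.
    """
    rng = np.random.default_rng(seed)
    W = random_weights(g, n, rng)
    return float(np.max(np.abs(np.linalg.eigvals(W))))


def final_state_rms(g, n=400, steps=300, seed=0):
    """Iterate h <- tanh(W h) with no input and return the RMS of the last state.

    In the ordered phase the state decays toward zero; in the chaotic
    phase it keeps an O(1) amplitude.
    """
    rng = np.random.default_rng(seed)
    W = random_weights(g, n, rng)
    h = rng.normal(0.0, 1.0, size=n)
    for _ in range(steps):
        h = np.tanh(W @ h)
    return float(np.sqrt(np.mean(h ** 2)))


if __name__ == "__main__":
    for g in (0.5, 0.9, 1.1, 1.5):
        print(f"g={g:.1f}  radius={spectral_radius(g):.3f}  "
              f"final RMS={final_state_rms(g):.3e}")
```

Sweeping `g` across 1 shows the qualitative change the abstract describes: activity dies out below the threshold and persists above it. Gated units shift this threshold, which is the case the paper's criterion is meant to handle.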
