Learnability Window in Gated Recurrent Neural Networks
Lorenzo Livi

TL;DR
This paper presents a statistical theory for understanding the limits of temporal learning in gated RNNs, linking gating mechanisms and optimizer effects to the maximum recoverable temporal horizon.
Contribution
It introduces the effective learning rate envelope and derives explicit scaling laws for the learnability window under different noise conditions.
Findings
Slower envelope decay increases the learnability window.
Heavy-tailed noise reduces the temporal horizon due to weaker concentration.
Experiments confirm the theoretical predictions across architectures and optimizers.
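The first two findings can be illustrated with a small numerical sketch. It is not the paper's construction: it assumes a hypothetical detectability rule in which a lag counts as learnable while the envelope value stays above a noise floor c * N^-(1 - 1/alpha), where alpha = 2 recovers the usual N^-1/2 rate and 1 < alpha < 2 models heavy-tailed noise. The function names, the constant c, and the two example envelopes are illustrative assumptions.

```python
import math


def concentration_rate(n_samples, alpha=2.0):
    """Assumed concentration rate N^-(1 - 1/alpha) of empirical averages.

    alpha = 2 gives the Gaussian-like N^-1/2; 1 < alpha < 2 is slower.
    """
    if not 1.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (1, 2]")
    return n_samples ** -(1.0 - 1.0 / alpha)


def learnability_window(envelope, n_samples, alpha=2.0, c=1.0, max_lag=10**6):
    """Largest lag (up to max_lag) whose envelope value is above the floor.

    Assumes `envelope` is non-increasing in the lag, so binary search works.
    The rule envelope(lag) >= c * rate is a hypothetical stand-in criterion.
    """
    floor = c * concentration_rate(n_samples, alpha)
    if envelope(1) < floor:
        return 0
    lo, hi = 1, max_lag
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if envelope(mid) >= floor:
            lo = mid
        else:
            hi = mid - 1
    return lo


def exp_envelope(lag):
    """Illustrative fast-decaying envelope: exp(-0.1 * lag)."""
    return math.exp(-0.1 * lag)


def poly_envelope(lag):
    """Illustrative slow-decaying envelope: 1 / lag."""
    return lag ** -1.0


if __name__ == "__main__":
    for n in (10**2, 10**4, 10**6):
        print(
            n,
            learnability_window(exp_envelope, n),
            learnability_window(poly_envelope, n),
            learnability_window(exp_envelope, n, alpha=1.5),
        )
```

Under these assumptions the exponentially decaying envelope yields a window that grows only logarithmically in the sample size, the polynomially decaying one yields polynomial growth, and lowering alpha shrinks the window for the same envelope.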
Abstract
We develop a statistical theory of temporal learnability in recurrent neural networks, quantifying the maximal temporal horizon over which gradient-based learning can recover lag-dependent structure at a finite sample size. The theory is built on the effective learning rate envelope, a functional that captures how gating mechanisms and adaptive optimizers jointly shape the coupling between state-space transport and parameter updates during Backpropagation Through Time. Under heavy-tailed (α-stable) fluctuations, where empirical averages concentrate at a rate set by the stability index α, the interplay between envelope decay and statistical concentration yields explicit scaling laws for the growth of this horizon: logarithmic, polynomial, and exponential temporal learning regimes emerge according to the decay law of…
