Generalization in Representation Models via Random Matrix Theory: Application to Recurrent Networks
Yessin Moakher (X), Malik Tiomoko, Cosme Louart (CUHK-Shenzhen), Zhenyu Liao (HUST)

TL;DR
This paper uses Random Matrix Theory to analyze the generalization error of models with a fixed feature representation and a trainable readout, including recurrent networks, showing in which data regimes echo-state networks outperform ridge regression and that linear ESNs are biased toward recent inputs.
Contribution
The paper applies Random Matrix Theory in the high-dimensional regime to derive a closed-form expression for the asymptotic generalization error of fixed-representation models, and specializes it to recurrent networks.
Findings
ESNs excel in low-sample, short-memory scenarios
Ridge regression performs better with more data or long-range dependencies
Linear ESNs have an inductive bias toward recent inputs
Abstract
We first study the generalization error of models that use a fixed feature representation (frozen intermediate layers) followed by a trainable readout layer. This setting encompasses a range of architectures, from deep random-feature models to echo-state networks (ESNs) with recurrent dynamics. Working in the high-dimensional regime, we apply Random Matrix Theory to derive a closed-form expression for the asymptotic generalization error. We then apply this analysis to recurrent representations and obtain concise formulas that characterize their performance. Surprisingly, we show that a linear ESN is equivalent to ridge regression with an exponentially time-weighted ("memory") input covariance, revealing a clear inductive bias toward recent inputs. Experiments match predictions: ESNs win in low-sample, short-memory regimes, while ridge prevails with more data or long-range dependencies.…
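To make the setting in the abstract concrete, here is a minimal sketch of a linear ESN with a frozen reservoir and a ridge-trained readout. It is an illustration under stated assumptions, not the paper's experimental setup: the scalar input, zero initial state, spectral radius of 0.9, one-step-recall target, and all names (`linear_esn_states`, `ridge_readout`, `W`, `w_in`, `lam`) are hypothetical choices made for this example. The sketch also checks numerically that the recurrent state equals the unrolled sum over past inputs, x_t = sum_k W^k w_in u_{t-k}, which is where a weighting toward recent inputs can come from: older inputs enter only through higher powers of W, which eventually shrink when the spectral radius is below one.

```python
import numpy as np

def linear_esn_states(u, W, w_in):
    """Run x_t = W x_{t-1} + w_in * u_t from x_0 = 0 and return every state."""
    n = W.shape[0]
    x = np.zeros(n)
    states = np.empty((len(u), n))
    for t, u_t in enumerate(u):
        x = W @ x + w_in * u_t
        states[t] = x
    return states

def ridge_readout(X, y, lam):
    """Closed-form ridge solution (X^T X + lam I)^{-1} X^T y."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

rng = np.random.default_rng(0)
T, n = 200, 20                      # sequence length and reservoir size (assumed)
u = rng.standard_normal(T)          # scalar input sequence (assumed)

# Frozen random reservoir, rescaled to spectral radius 0.9 (assumed value).
W = rng.standard_normal((n, n))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
w_in = rng.standard_normal(n)

states = linear_esn_states(u, W, w_in)

# Unroll the recurrence for the last time step:
# x_t = sum_{k=0}^{t} W^k w_in u_{t-k}.
t = T - 1
unrolled = np.zeros(n)
Wk_win = w_in.copy()                # holds W^k w_in, starting at k = 0
for k in range(t + 1):
    unrolled += Wk_win * u[t - k]
    Wk_win = W @ Wk_win

# Only the readout is trained; the toy target is the previous input u_{t-1}.
y = np.roll(u, 1)
y[0] = 0.0
w_out = ridge_readout(states, y, lam=1e-2)
train_mse = np.mean((states @ w_out - y) ** 2)
```

Because the reservoir is never updated, training reduces to a single linear solve over the collected states, which is the "fixed representation plus trainable readout" structure the abstract describes.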
Taxonomy
Topics: Neural Networks and Reservoir Computing · Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks
