Learning fixed points of recurrent neural networks by reparameterizing the network model
Vicky Zhu, Robert Rosenbaum

TL;DR
This paper studies how to train recurrent neural networks when the loss is evaluated at their fixed points. It shows that standard Euclidean gradient descent on the weights can perform poorly, and that learning rules derived from a reparameterized model, which correspond to descent under a non-Euclidean metric, make learning more robust.
Contribution
The paper proposes a novel reparameterization approach that yields two alternative learning rules, enhancing the robustness of fixed point training in recurrent neural networks.
Findings
Reparameterization leads to more stable learning dynamics.
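Learning rules based on a non-Euclidean metric on the recurrent weights produce more robust learning dynamics.
Euclidean gradient descent on the synaptic weights can perform poorly, in part because of singularities in the loss surface.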
Abstract
In computational neuroscience, fixed points of recurrent neural networks are commonly used to model neural responses to static or slowly changing stimuli. These applications raise the question of how to train the weights in a recurrent neural network to minimize a loss function evaluated on fixed points. A natural approach is to use gradient descent on the Euclidean space of synaptic weights. We show that this approach can lead to poor learning performance due, in part, to singularities that arise in the loss surface. We use a reparameterization of the recurrent network model to derive two alternative learning rules that produce more robust learning dynamics. We show that these learning rules can be interpreted as steepest descent and gradient descent, respectively, under a non-Euclidean metric on the space of recurrent weights. Our results question the common, implicit assumption that…
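To make the contrast in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes a linear network whose fixed point is r = (I - W)^{-1} x and a squared-error loss against a target y. The abstract does not spell out the reparameterization, so the choice A = (I - W)^{-1} below is an illustrative assumption, and the function names are hypothetical. The Euclidean gradient with respect to W involves (I - W)^{-1} and blows up as I - W approaches a singular matrix, while the gradient with respect to A does not.

```python
import numpy as np

def fixed_point(W, x):
    """Fixed point of the assumed linear dynamics dr/dt = -r + W r + x."""
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - W, x)

def loss(r, y):
    """Squared-error loss evaluated at the fixed point."""
    return 0.5 * np.sum((r - y) ** 2)

def euclidean_step(W, x, y, lr):
    """One gradient-descent step directly on the weights W.

    With A = (I - W)^{-1} and r = A x, the gradient is A^T (r - y) r^T,
    which grows without bound as I - W approaches singularity.
    """
    n = W.shape[0]
    A = np.linalg.inv(np.eye(n) - W)
    r = A @ x
    grad_W = A.T @ np.outer(r - y, r)
    return W - lr * grad_W

def reparameterized_step(W, x, y, lr):
    """One gradient-descent step on A = (I - W)^{-1} (assumed reparameterization).

    Since r = A x is linear in A, the gradient is simply (r - y) x^T.
    The updated A is mapped back to weights via W = I - A^{-1}.
    """
    n = W.shape[0]
    A = np.linalg.inv(np.eye(n) - W)
    r = A @ x
    A_new = A - lr * np.outer(r - y, x)
    return np.eye(n) - np.linalg.inv(A_new)

if __name__ == "__main__":
    W0 = np.array([[0.1, 0.2], [0.0, 0.1]])
    x = np.array([1.0, 0.0])
    y = np.array([0.0, 1.0])
    W_euc, W_rep = W0.copy(), W0.copy()
    for step in range(5):
        W_euc = euclidean_step(W_euc, x, y, lr=0.1)
        W_rep = reparameterized_step(W_rep, x, y, lr=0.1)
        print(step, loss(fixed_point(W_euc, x), y), loss(fixed_point(W_rep, x), y))
```

In this linear setting, each reparameterized step scales the fixed-point error r - y by the factor 1 - lr * ||x||^2, so the loss surface seen through A is a simple quadratic. The Euclidean step has no such guarantee, because its gradient depends on the current (I - W)^{-1}.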
Taxonomy
Topics: Neural Networks and Applications · Neural dynamics and brain function · Advanced Memory and Neural Computing
