Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules
Kazuki Irie, Francesco Faccio, Jürgen Schmidhuber

TL;DR
This paper combines learning rules with Neural ODEs to build continuous-time sequence-processing networks that outperform existing Neural CDE models on time series classification.
Contribution
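The paper integrates learning rules with Neural ODEs so that one network learns to manipulate short-term memory stored in the rapidly changing synaptic connections of another network, while addressing the scalability limitations of previous models.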
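To make the idea of short-term memory held in rapidly changing weights concrete, here is a minimal sketch. It is not the paper's formulation: it assumes a plain Hebbian outer-product rule, dW/dt = v(t) k(t)^T, integrated with a fixed-step Euler scheme. The function name `hebbian_fast_weights` and the toy dimensions are hypothetical. With step size 1, querying the resulting fast weight matrix gives the same output as unnormalised linear attention over the stored key/value pairs.

```python
import numpy as np

def hebbian_fast_weights(keys, values, dt=1.0):
    """Euler-integrate the assumed rule dW/dt = v(t) k(t)^T from W(0) = 0."""
    d_out, d_in = values.shape[1], keys.shape[1]
    W = np.zeros((d_out, d_in))
    for k, v in zip(keys, values):
        # One Euler step: add the outer product of the current value and key.
        W = W + dt * np.outer(v, k)
    return W

rng = np.random.default_rng(0)
keys = rng.normal(size=(5, 4))     # toy key sequence (hypothetical sizes)
values = rng.normal(size=(5, 3))   # toy value sequence
query = rng.normal(size=4)

W = hebbian_fast_weights(keys, values)
y_fast = W @ query                 # read out from the fast weights

# Unnormalised linear attention over the same keys and values.
y_attn = sum(v * (k @ query) for k, v in zip(keys, values))
```

The two read-outs agree because the fast weight matrix is the sum of the value/key outer products. A learned model would generate the keys, values and queries with a trainable network and would typically use an adaptive ODE solver rather than fixed Euler steps.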
Findings
Outperforms existing Neural CDE models on time series classification
Addresses scalability limitations of previous models
Provides publicly available code for the proposed models
Abstract
Neural ordinary differential equations (ODEs) have attracted much attention as continuous-time counterparts of deep residual neural networks (NNs), and numerous extensions for recurrent NNs have been proposed. Since the 1980s, ODEs have also been used to derive theoretical results for NN learning rules, e.g., the famous connection between Oja's rule and principal component analysis. Such rules are typically expressed as additive iterative update processes which have straightforward ODE counterparts. Here we introduce a novel combination of learning rules and Neural ODEs to build continuous-time sequence processing nets that learn to manipulate short-term memory in rapidly changing synaptic connections of other nets. This yields continuous-time counterparts of Fast Weight Programmers and linear Transformers. Our novel models outperform the best existing Neural Controlled Differential…
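The link the abstract mentions between additive learning rules and ODEs can be illustrated with Oja's rule. The sketch below is an illustration under stated assumptions, not code from the paper: it uses the averaged form of the rule, dw/dt = C w - (w^T C w) w, where C is the input covariance matrix, and integrates it with fixed-step Euler updates. The function name `oja_flow`, the matrix `C`, the step size and the step count are all hypothetical choices.

```python
import numpy as np

def oja_flow(C, w0, dt=0.01, steps=5000):
    """Euler-integrate the averaged Oja ODE dw/dt = C w - (w^T C w) w."""
    w = w0.astype(float).copy()
    for _ in range(steps):
        Cw = C @ w
        # Hebbian growth term minus a decay term that keeps the norm bounded.
        w = w + dt * (Cw - (w @ Cw) * w)
    return w

# Toy symmetric positive-definite covariance matrix (hypothetical).
C = np.array([[3.0, 1.0],
              [1.0, 2.0]])
w = oja_flow(C, np.array([1.0, 0.0]))

# Compare with the leading eigenvector, i.e. the first principal direction.
eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
top = eigvecs[:, -1]
alignment = abs(w @ top)               # close to 1 when w matches the principal direction
```

Each Euler step has the same additive form as one iteration of the discrete rule, which is why such rules have straightforward ODE counterparts. In this toy setting the weight vector settles on a unit-norm vector aligned with the first principal component of C.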
Taxonomy
Topics: Model Reduction and Neural Networks · Neural Networks and Applications · Neural Networks and Reservoir Computing
