Emergent mechanisms for long timescales depend on training curriculum and affect performance in memory tasks
Sina Khajehabdollahi, Roxana Zeraati, Emmanouil Giannakakis, Tim Jakob Schäfer, Georg Martius, Anna Levina

TL;DR
This study investigates how the training curriculum shapes the emergence of long timescales in recurrent neural networks, and shows that different curricula lead to different memory mechanisms and different task performance.
Contribution
It demonstrates that the training curriculum determines whether RNNs develop longer single-neuron timescales or rely on recurrent connectivity, and that this choice affects performance and generalization on memory tasks.
Findings
A multi-head curriculum keeps single-neuron timescales constant while network-mediated timescales grow longer.
Multi-head training improves speed, stability, and generalization of RNNs on memory tasks.
Adapting timescales via recurrent interactions enhances learning of complex temporal objectives.
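The distinction between the two mechanisms can be illustrated with a toy calculation. The sketch below assumes a single linear leaky unit with a self-connection, which is a simplification for illustration and not the paper's model; the function name `effective_timescale` and the parameter `w` are hypothetical. It shows that the same decay time can be reached either by raising the single-neuron timescale tau or by strengthening the recurrent coupling.

```python
import math


def effective_timescale(tau: float, w: float) -> float:
    """Decay time (in steps) of one linear leaky unit with self-coupling w.

    Assumed toy update rule (illustrative, not taken from the paper):
        r[t] = (1 - 1/tau) * r[t-1] + (1/tau) * w * r[t-1]
    The activity therefore decays geometrically with factor
        lam = 1 - (1 - w) / tau
    and the effective timescale is -1 / ln(lam).
    """
    if tau < 1.0:
        raise ValueError("tau must be >= 1 for a stable discrete update")
    lam = 1.0 - (1.0 - w) / tau
    if not 0.0 < lam < 1.0:
        raise ValueError("need 0 < lam < 1 for a decaying, non-oscillating unit")
    return -1.0 / math.log(lam)


# Two routes to the same memory span in this toy setting:
slow_neuron = effective_timescale(tau=10.0, w=0.0)   # long single-neuron timescale
strong_loop = effective_timescale(tau=2.0, w=0.8)    # short tau, strong recurrence
```

In this toy setting both calls give a decay factor of 0.9, so the two units forget at the same rate even though only the first has a long intrinsic timescale.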
Abstract
Recurrent neural networks (RNNs) in the brain and in silico excel at solving tasks with intricate temporal dependencies. Long timescales required for solving such tasks can arise from properties of individual neurons (single-neuron timescale, τ, e.g., membrane time constant in biological neurons) or recurrent interactions among them (network-mediated timescale). However, the contribution of each mechanism for optimally solving memory-dependent tasks remains poorly understood. Here, we train RNNs to solve N-parity and N-delayed match-to-sample tasks with increasing memory requirements controlled by N, by simultaneously optimizing recurrent weights and τs. We find that for both tasks RNNs develop longer timescales with increasing N, but depending on the learning objective, they use different mechanisms. Two distinct curricula define learning objectives: sequential…
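To make the two tasks named in the abstract concrete, here is a minimal sketch of how their targets could be generated from a binary input stream. It assumes that N-parity asks for the parity of the most recent N inputs and that N-delayed match-to-sample asks whether the current input equals the input N - 1 steps earlier; the exact delay convention and the function names are assumptions for illustration, not details confirmed by this page.

```python
from typing import List, Optional


def n_parity_targets(bits: List[int], n: int) -> List[Optional[int]]:
    """Target at step t is the parity (sum mod 2) of the last n inputs.

    Steps with fewer than n inputs seen so far have no target (None).
    """
    return [
        sum(bits[t - n + 1 : t + 1]) % 2 if t >= n - 1 else None
        for t in range(len(bits))
    ]


def n_dms_targets(bits: List[int], n: int) -> List[Optional[int]]:
    """Target at step t is 1 if bits[t] equals bits[t - n + 1], else 0.

    The delay of n - 1 steps is an assumed convention.
    """
    return [
        int(bits[t] == bits[t - n + 1]) if t >= n - 1 else None
        for t in range(len(bits))
    ]


stream = [1, 0, 1, 1, 0]
parity_2 = n_parity_targets(stream, n=2)
dms_3 = n_dms_targets(stream, n=3)
```

Under these assumptions, increasing N forces the network to retain each input for more steps, which is how N controls the memory requirement.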
Taxonomy
Topics: Neural Networks and Applications · Advanced Memory and Neural Computing · Domain Adaptation and Few-Shot Learning
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
