Exploring the Promise and Limits of Real-Time Recurrent Learning
Kazuki Irie, Anand Gopalakrishnan, Jürgen Schmidhuber

TL;DR
This paper investigates how practical real-time recurrent learning (RTRL) is for sequence-processing RNNs. It combines RTRL with policy gradients in actor-critic methods, reports competitive results on subsets of DMLab-30, ProcGen, and Atari-2600, and discusses RTRL's limitations in multi-layer networks.
Contribution
The paper explores RTRL in realistic settings with well-known architectures, reports competitive results, and discusses its limitations in the multi-layer case.
Findings
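Combined with specific architectures, RTRL can be used effectively in complex environments (subsets of DMLab-30, ProcGen, and Atari-2600).
On DMLab memory tasks, the system trained on fewer than 1.2 B environment frames is competitive with or outperforms the IMPALA and R2D2 baselines trained on 10 B frames.
The paper identifies key limitations of RTRL in multi-layer neural networks.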
Abstract
Real-time recurrent learning (RTRL) for sequence-processing recurrent neural networks (RNNs) offers certain conceptual advantages over backpropagation through time (BPTT). RTRL requires neither caching past activations nor truncating context, and enables online learning. However, RTRL's time and space complexity make it impractical. To overcome this problem, most recent work on RTRL focuses on approximation theories, while experiments are often limited to diagnostic settings. Here we explore the practical promise of RTRL in more realistic settings. We study actor-critic methods that combine RTRL and policy gradients, and test them in several subsets of DMLab-30, ProcGen, and Atari-2600 environments. On DMLab memory tasks, our system trained on fewer than 1.2 B environmental frames is competitive with or outperforms well-known IMPALA and R2D2 baselines trained on 10 B frames. To scale to…
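To make the trade-off described in the abstract concrete, here is a minimal sketch of textbook RTRL for a small, fully recurrent tanh RNN. It is not the architecture or the actor-critic setup used in the paper. The function names (`rnn_step`, `total_loss`, `rtrl_grad_W`) and the squared-error loss are assumptions chosen for illustration. The sensitivity tensor `S` is carried forward in time, so no past activations are cached and the gradient is available at every step. The cost is O(n^3) memory and O(n^4) time per step for n hidden units.

```python
import math


def rnn_step(W, U, h_prev, x):
    """One step of a vanilla RNN: h = tanh(W h_prev + U x)."""
    n = len(W)
    return [
        math.tanh(
            sum(W[i][l] * h_prev[l] for l in range(n))
            + sum(U[i][m] * x[m] for m in range(len(x)))
        )
        for i in range(n)
    ]


def total_loss(W, U, xs, ys):
    """Sum over time of 0.5 * ||h_t - y_t||^2 (illustrative loss, an assumption)."""
    h = [0.0] * len(W)
    loss = 0.0
    for x, y in zip(xs, ys):
        h = rnn_step(W, U, h, x)
        loss += 0.5 * sum((hi - yi) ** 2 for hi, yi in zip(h, y))
    return loss


def rtrl_grad_W(W, U, xs, ys):
    """Gradient of total_loss w.r.t. W, accumulated online with RTRL."""
    n = len(W)
    h = [0.0] * n
    # S[i][j][k] = d h[i] / d W[j][k]; size n^3, independent of sequence length.
    S = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    grad = [[0.0] * n for _ in range(n)]
    for x, y in zip(xs, ys):
        h_new = rnn_step(W, U, h, x)
        S_new = [[[0.0] * n for _ in range(n)] for _ in range(n)]
        for i in range(n):
            d_tanh = 1.0 - h_new[i] ** 2  # derivative of tanh at the pre-activation
            for j in range(n):
                for k in range(n):
                    # Direct effect of W[j][k] on unit i at this step.
                    direct = h[k] if i == j else 0.0
                    # Effect carried through the previous hidden state.
                    carried = sum(W[i][l] * S[l][j][k] for l in range(n))
                    S_new[i][j][k] = d_tanh * (direct + carried)
        # The loss gradient for this step is available immediately (online learning).
        for j in range(n):
            for k in range(n):
                grad[j][k] += sum(
                    (h_new[i] - y[i]) * S_new[i][j][k] for i in range(n)
                )
        h, S = h_new, S_new
    return grad
```

The innermost `carried` sum is what drives the O(n^4) per-step time. BPTT avoids that cost but has to store the hidden states of the whole sequence, or truncate it.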
Taxonomy
Topics: Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices · Neural Networks and Applications
Methods: Convolution · Max Pooling · Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Gradient Clipping · V-trace · Experience Replay
