Scalable Real-Time Recurrent Learning Using Columnar-Constructive Networks

Khurram Javed; Haseeb Shah; Rich Sutton; Martha White

arXiv:2302.05326 · cs.LG · November 23, 2023 · 1 citation

TL;DR

This paper makes real-time recurrent learning (RTRL) scalable without adding noise or bias to the gradient, enabling online updates for large recurrent networks in reinforcement learning.

Contribution

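It proposes two constraints on the network, decomposing it into independent modules or learning it in stages, either of which makes RTRL scale linearly with the number of parameters. Unlike UORO and Truncated-BPTT, the resulting algorithms add no noise or bias to the gradient.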

Findings

01 Outperforms Truncated-BPTT on a prediction benchmark.

02 Effective in policy evaluation for Atari 2600 games.

03 Scales linearly with the number of parameters, enabling large-scale online learning.

Abstract

Constructing states from sequences of observations is an important component of reinforcement learning agents. One solution for state construction is to use recurrent neural networks. Back-propagation through time (BPTT) and real-time recurrent learning (RTRL) are two popular gradient-based methods for recurrent learning. BPTT requires complete trajectories of observations before it can compute the gradients and is unsuitable for online updates. RTRL can do online updates but scales poorly to large networks. In this paper, we propose two constraints that make RTRL scalable. We show that by either decomposing the network into independent modules or learning the network in stages, we can make RTRL scale linearly with the number of parameters. Unlike prior scalable gradient estimation algorithms, such as UORO and Truncated-BPTT, our algorithms do not add noise or bias to the gradient…
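
To make the "independent modules" idea concrete, here is a minimal sketch of RTRL for a single module. It is an illustration under stated assumptions, not the paper's implementation: the module is assumed to be one tanh unit with a self-recurrent weight and no connections to other modules, and the names `Column`, `step`, and `final_state` are hypothetical. Because the unit's state depends only on its own parameters, the RTRL sensitivities can be stored as one trace per parameter and updated exactly at every step, so the cost of a step is linear in the number of parameters of the module.

```python
import math


class Column:
    """Hypothetical single-unit recurrent module: h_t = tanh(w . x_t + u * h_{t-1} + b)."""

    def __init__(self, w, u, b):
        self.w = list(w)  # input weights
        self.u = u        # self-recurrent weight
        self.b = b        # bias
        self.h = 0.0      # recurrent state, starts at zero
        # RTRL traces: derivative of the current state w.r.t. each parameter.
        self.dh_dw = [0.0] * len(self.w)
        self.dh_du = 0.0
        self.dh_db = 0.0

    def step(self, x):
        """Advance one time step and update the traces online (no stored history)."""
        h_prev = self.h
        pre = sum(wi * xi for wi, xi in zip(self.w, x)) + self.u * h_prev + self.b
        h = math.tanh(pre)
        d = 1.0 - h * h  # derivative of tanh at the pre-activation
        # Chain rule through the direct dependence and through h_{t-1}.
        self.dh_dw = [d * (xi + self.u * t) for xi, t in zip(x, self.dh_dw)]
        self.dh_du = d * (h_prev + self.u * self.dh_du)
        self.dh_db = d * (1.0 + self.u * self.dh_db)
        self.h = h
        return h


def final_state(w, u, b, xs):
    """Run a fresh column over the sequence xs and return it."""
    col = Column(w, u, b)
    for x in xs:
        col.step(x)
    return col
```

A network built from several such columns would keep one set of traces per column and update each independently. One way to check the traces is to compare them against finite differences of the final state with respect to each parameter.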

Code & Models

Repositories

khurramjaved96/atari-prediction-benchmark (official)

Taxonomy

Topics: Neural Networks and Applications · Reinforcement Learning in Robotics · Model Reduction and Neural Networks

Methods: Unbiased Online Recurrent Optimization