Efficient LSTM Training with Eligibility Traces

Michael Hoyer; Shahram Eivazi; Sebastian Otte

arXiv:2209.15502·cs.LG·October 3, 2022

TL;DR

This paper applies e-prop, a more efficient and biologically plausible alternative to BPTT, to LSTM training in both supervised and reinforcement learning. With the proposed extensions, e-prop is competitive with BPTT and, under certain conditions, outperforms it.

Contribution

The paper shows that e-prop can train LSTMs on long sequences of several hundred timesteps, introduces extensions that improve its performance, and provides a proof of concept for integrating e-prop into RL.

Findings

01. e-prop is suitable for long-sequence LSTM training

02. Extensions improve e-prop performance and can outperform BPTT in some cases

03. Successful integration of e-prop into deep RL (Q-learning)

Abstract

Training recurrent neural networks is predominantly achieved via backpropagation through time (BPTT). However, this algorithm is not an optimal solution from either a biological or a computational perspective. A more efficient and biologically plausible alternative to BPTT is e-prop. We investigate the applicability of e-prop to long short-term memory networks (LSTMs), for both supervised and reinforcement learning (RL) tasks. We show that e-prop is a suitable optimization algorithm for LSTMs by comparing it to BPTT on two benchmarks for supervised learning. This proves that e-prop can achieve learning even for problems with long sequences of several hundred timesteps. We introduce extensions that improve the performance of e-prop, which can partially be applied to other network architectures. With the help of these extensions we show that, under certain conditions, e-prop can outperform BPTT for…
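To make the idea of an eligibility trace concrete, here is a minimal sketch in Python with NumPy. It is not the paper's LSTM derivation and does not include its extensions. It assumes a toy leaky-integrator layer, h_t = alpha * h_{t-1} + W x_t, with a squared-error loss at every timestep. The function name `eprop_gradient`, the parameter `alpha`, and the loss are all assumptions chosen for illustration. The trace is carried forward in time and multiplied by a per-step learning signal, so no backward pass through the sequence is needed. Because this toy model has no coupling between hidden units, the resulting gradient is exact here; for a general recurrent network, e-prop is an approximation.

```python
import numpy as np


def eprop_gradient(W, xs, targets, alpha=0.9):
    """Forward-in-time gradient for a toy leaky-integrator layer.

    Model (an illustrative assumption, not the paper's LSTM):
        h_t = alpha * h_{t-1} + W @ x_t
        L   = sum_t 0.5 * ||h_t - target_t||^2
    Returns the loss and dL/dW, accumulated online.
    """
    n_hidden, n_in = W.shape
    h = np.zeros(n_hidden)
    # Eligibility trace: d h_t[j] / d W[j, i]. In this toy model it does
    # not depend on j, so one vector over the inputs is enough.
    trace = np.zeros(n_in)
    grad = np.zeros_like(W)
    loss = 0.0
    for x, target in zip(xs, targets):
        h = alpha * h + W @ x          # forward step
        trace = alpha * trace + x      # update the trace forward in time
        signal = h - target            # learning signal dL_t / dh_t
        grad += np.outer(signal, trace)
        loss += 0.5 * np.sum(signal ** 2)
    return loss, grad


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(3, 2))
    xs = rng.normal(size=(20, 2))
    targets = rng.normal(size=(20, 3))
    loss, grad = eprop_gradient(W, xs, targets)
    print("loss:", loss)
    print("gradient shape:", grad.shape)
```

Memory use in this sketch is independent of sequence length, since only the current state and trace are stored; this is the property that makes trace-based training attractive for long sequences.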
