Optimal Kronecker-Sum Approximation of Real Time Recurrent Learning
Frederik Benzing, Marcelo Matheus Gauy, Asier Mujika, Anders Martinsson, Angelika Steger

TL;DR
This paper introduces the Optimal Kronecker-Sum Approximation (OK), a new method for approximating Real Time Recurrent Learning (RTRL) that is provably optimal within a class of RTRL approximations and has empirically negligible noise, enabling better learning of long-term dependencies in RNNs.
Contribution
The paper proposes the OK algorithm, proves that it is optimal within a class of RTRL approximations that includes all previously published approaches, and demonstrates its effectiveness on real-world and synthetic tasks.
Findings
OK matches TBPTT on the character-level Penn TreeBank task.
OK outperforms TBPTT in a synthetic string memorization task.
OK has empirically negligible noise compared to previous methods.
Abstract
One of the central goals of Recurrent Neural Networks (RNNs) is to learn long-term dependencies in sequential data. Nevertheless, the most popular training method, Truncated Backpropagation through Time (TBPTT), categorically forbids learning dependencies beyond the truncation horizon. In contrast, the online training algorithm Real Time Recurrent Learning (RTRL) provides untruncated gradients, with the disadvantage of impractically large computational costs. Recently published approaches reduce these costs by providing noisy approximations of RTRL. We present a new approximation algorithm of RTRL, Optimal Kronecker-Sum Approximation (OK). We prove that OK is optimal for a class of approximations of RTRL, which includes all approaches published so far. Additionally, we show that OK has empirically negligible noise: Unlike previous algorithms it matches TBPTT in a real world task…
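As a rough illustration of the contrast the abstract draws between truncated and untruncated gradients, the following minimal Python sketch uses a one-unit linear recurrence h_t = w * h_{t-1} + x_t. The toy model and the function names rtrl_grad and truncated_grad are assumptions made for illustration only; they are not taken from the paper, and the sketch does not implement OK itself.

```python
def rtrl_grad(w, xs):
    """Forward-mode (RTRL-style) derivative dh_T/dw for h_t = w*h_{t-1} + x_t.

    Hypothetical toy model: one hidden unit, one parameter w, h_0 = 0.
    The sensitivity g_t = dh_t/dw is carried forward alongside the state,
    so no history has to be stored and nothing is truncated.
    """
    h, g = 0.0, 0.0
    for x in xs:
        g = h + w * g      # g_t = h_{t-1} + w * g_{t-1}
        h = w * h + x      # h_t = w * h_{t-1} + x_t
    return h, g


def truncated_grad(w, xs, horizon):
    """dh_T/dw when only the last `horizon` steps are differentiated.

    Mimics truncated backpropagation: the state at the start of the
    window is treated as a constant, so its dependence on w is dropped.
    """
    hs = [0.0]                       # hs[t] holds h_t
    for x in xs:
        hs.append(w * hs[-1] + x)
    start = max(0, len(xs) - horizon)
    g = 0.0                          # dependence of h_start on w is ignored
    for t in range(start, len(xs)):
        g = hs[t] + w * g
    return hs[-1], g


if __name__ == "__main__":
    xs = [1.0, 0.0, 0.0, 0.0]        # the only input arrives at the first step
    print(rtrl_grad(0.5, xs))        # full derivative of h_4 = w**3
    print(truncated_grad(0.5, xs, horizon=2))
```

With w = 0.5 and this input, h_4 = w^3, so the exact derivative is 3w^2 = 0.75, which the forward recursion returns. The version truncated to a horizon of 2 returns 0.5, because it ignores how the state before the window depends on w. In a full RNN, the scalar g becomes a matrix with one row per hidden unit and one column per parameter, which is where the large cost of exact RTRL comes from.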
Taxonomy
Topics: Machine Learning in Healthcare · Time Series Analysis and Forecasting · Machine Learning and Data Classification
