An Efficient and Effective Second-Order Training Algorithm for LSTM-based Adaptive Learning

N. Mert Vural; Salih Ergüt; Suleyman S. Kozat

arXiv:1910.09857 · cs.LG · June 1, 2021

TL;DR

This paper introduces a second-order training algorithm for LSTM networks based on the Extended Kalman filter (EKF). In adaptive learning tasks it improves accuracy by 10 to 45% over Adam, RMSprop, and DEKF, and matches the accuracy of the full EKF with a 10 to 15 times shorter run-time.

Contribution

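The paper presents an efficient EKF-based second-order training algorithm for LSTM networks. The algorithm is fully online: it assumes nothing about the data-generating process or future information beyond a bounded target sequence. It is more accurate than the widely used adaptive methods Adam, RMSprop, and DEKF, and much faster than the full EKF at comparable accuracy.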

Findings

01 10-45% accuracy improvement over Adam, RMSprop, and DEKF

02 Accuracy comparable to EKF with a 10-15 times shorter run-time

03 Consistent performance gains across experiments

Abstract

We study adaptive (or online) nonlinear regression with Long-Short-Term-Memory (LSTM) based networks, i.e., LSTM-based adaptive learning. In this context, we introduce an efficient Extended Kalman filter (EKF) based second-order training algorithm. Our algorithm is truly online, i.e., it does not assume any underlying data generating process and future information, except that the target sequence is bounded. Through an extensive set of experiments, we demonstrate significant performance gains achieved by our algorithm with respect to the state-of-the-art methods. Here, we mainly show that our algorithm consistently provides 10 to 45% improvement in the accuracy compared to the widely-used adaptive methods Adam, RMSprop, and DEKF, and comparable performance to EKF with a 10 to 15 times reduction in the run-time.
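
For readers unfamiliar with EKF-based training, the following is a minimal sketch of a textbook EKF parameter update in an online regression loop. It is not the algorithm proposed in the paper: the model is a toy single tanh neuron rather than an LSTM, and the names `model`, `jacobian`, `ekf_step`, and `run_demo`, as well as the noise settings `R` and `Q`, are assumptions chosen purely for illustration.

```python
import numpy as np

def model(theta, x):
    """Toy single-neuron model: tanh(w . x + b), with theta = [w, b]."""
    return np.tanh(theta[:-1] @ x + theta[-1])

def jacobian(theta, x):
    """Gradient of the model output with respect to theta."""
    y = model(theta, x)
    return (1.0 - y**2) * np.append(x, 1.0)

def ekf_step(theta, P, x, d, R=0.1, Q=1e-6):
    """One generic EKF update treating the weights as the state.

    R (measurement noise) and Q (process noise) are illustrative
    assumptions, not values taken from the paper.
    """
    H = jacobian(theta, x)             # linearized observation vector
    err = d - model(theta, x)          # prediction error before the update
    S = H @ P @ H + R                  # innovation variance (scalar)
    K = P @ H / S                      # Kalman gain
    theta = theta + K * err            # weight update
    P = P - np.outer(K, H @ P) + Q * np.eye(len(theta))  # covariance update
    return theta, P, err

def run_demo(steps=500, seed=0):
    """Stream synthetic samples one at a time and update after each."""
    rng = np.random.default_rng(seed)
    true_theta = np.array([0.8, -0.5, 0.3])  # hypothetical target weights
    theta = np.zeros(3)
    P = np.eye(3)
    sq_errors = []
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0, size=2)
        d = model(true_theta, x)       # noise-free target for simplicity
        theta, P, err = ekf_step(theta, P, x, d)
        sq_errors.append(err**2)
    return theta, true_theta, np.array(sq_errors)
```

Each sample is used once and then discarded, which is the sense in which such a loop is online. Note that the covariance matrix `P` has one row and column per weight, so a full EKF of this kind costs on the order of the squared parameter count per step; that cost is the usual motivation for cheaper variants.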

Code & Models

Repositories

nurimertvural/EfficientEffectiveLSTM
Official

Taxonomy

Methods: Adam · Sigmoid Activation · Tanh Activation · Long Short-Term Memory