Simplified Long Short-term Memory Recurrent Neural Networks: part I
Atra Akandeh, Fathi M. Salem

TL;DR
This paper introduces five simplified LSTM variants with fewer parameters, enabling faster training and better suitability for embedded systems, while maintaining comparable accuracy on MNIST.
Contribution
The paper proposes five parameter-reduced LSTM variants that retain performance and are more efficient for constrained environments.
Findings
Variants perform comparably to standard LSTM on MNIST.
Parameter reduction speeds up training.
LSTM3, LSTM4, and LSTM5 retain their accuracy with the ReLU nonlinearity, whereas the standard LSTM's accuracy can drop after a number of epochs.
Abstract
We present five variants of the standard Long Short-term Memory (LSTM) recurrent neural network, obtained by uniformly reducing blocks of adaptive parameters in the gating mechanisms. For simplicity, we refer to these models as LSTM1, LSTM2, LSTM3, LSTM4, and LSTM5, respectively. Such parameter-reduced variants speed up training computations and are better suited to implementation on constrained embedded platforms. We comparatively evaluate and verify the five variants on the classical MNIST dataset and demonstrate that they are comparable to a standard implementation of the LSTM model while using fewer parameters. Moreover, we observe that in some cases the standard LSTM's accuracy drops after a number of epochs when using the ReLU nonlinearity; in contrast, LSTM3, LSTM4, and LSTM5 retain their performance.
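To give a rough sense of what "reducing blocks of adaptive parameters in the gating mechanisms" can mean for model size, the sketch below counts the parameters of a standard LSTM layer and of a hypothetical reduced layer in which each of the three gates keeps only a chosen subset of its terms. The function names, the `gate_terms` argument, and the sizes (28 inputs, 100 hidden units) are assumptions made for illustration. This summary does not state which blocks each of LSTM1 to LSTM5 removes, so the configurations shown are not the paper's definitions.

```python
def lstm_param_count(input_dim, hidden_dim):
    """Parameters in a standard LSTM layer.

    Four blocks (input, forget, and output gates plus the cell candidate),
    each with an input matrix W, a recurrent matrix U, and a bias b.
    """
    per_block = hidden_dim * input_dim + hidden_dim * hidden_dim + hidden_dim
    return 4 * per_block


def reduced_param_count(input_dim, hidden_dim, gate_terms):
    """Parameters when each of the three gates keeps only `gate_terms`.

    `gate_terms` is a subset of {"W", "U", "b"} (hypothetical labels).
    The cell-candidate block is left untouched in this sketch.
    """
    sizes = {
        "W": hidden_dim * input_dim,   # input-to-gate weights
        "U": hidden_dim * hidden_dim,  # hidden-to-gate weights
        "b": hidden_dim,               # gate bias
    }
    candidate = sizes["W"] + sizes["U"] + sizes["b"]
    per_gate = sum(sizes[term] for term in gate_terms)
    return candidate + 3 * per_gate


if __name__ == "__main__":
    m, n = 28, 100  # assumed sizes, e.g. one 28-pixel MNIST row per time step
    print("standard LSTM:       ", lstm_param_count(m, n))
    print("gates keep U and b:  ", reduced_param_count(m, n, {"U", "b"}))
    print("gates keep only b:   ", reduced_param_count(m, n, {"b"}))
```

Under these assumed sizes, dropping the input matrix from the gates removes 3 × 2,800 parameters, and keeping only the gate biases shrinks the layer to roughly a quarter of the standard count, which illustrates why such reductions can cut training cost and memory footprint.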
Taxonomy
Topics: Neural Networks and Applications
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
