
TL;DR
This paper proposes a modified LSTM-like RNN architecture that stays simple and outperforms standard LSTM on the tasks the authors tested, and it introduces a new handwritten-digit benchmark for evaluating RNN performance.
Contribution
A modified LSTM-like gated RNN architecture that remains simple and achieves better results than LSTM on the tested tasks, together with a new handwritten-digit benchmark that stresses several important network capabilities.
Findings
The proposed architecture outperforms standard LSTM on the tasks tested.
The new handwritten-digit benchmark stresses several important network capabilities.
The architecture remains simple despite the modification.
Abstract
Previous RNN architectures have largely been superseded by LSTM, or "Long Short-Term Memory". Since its introduction, there have been many variations on this simple design. However, it is still widely used and we are not aware of a gated-RNN architecture that outperforms LSTM in a broad sense while still being as simple and efficient. In this paper we propose a modified LSTM-like architecture. Our architecture is still simple and achieves better performance on the tasks that we tested on. We also introduce a new RNN performance benchmark that uses the handwritten digits and stresses several important network capabilities.
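For readers who want the baseline in front of them, the following is a minimal sketch of one step of a standard LSTM cell, the design the abstract compares against. It is not the modified architecture the paper proposes, whose equations do not appear in this summary. The function names (`lstm_step`, `zero_params`, `affine`) and the parameter layout are assumptions made for illustration, and the code uses plain Python lists rather than any particular library.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, used for the three gates."""
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, z, b):
    """Compute W @ z + b, where W is a list of rows."""
    return [sum(w * v for w, v in zip(row, z)) + b_k for row, b_k in zip(W, b)]


def zero_params(n_in, n_hidden):
    """Hypothetical helper: all-zero weights and biases for the four transforms."""
    params = {}
    for gate in ("i", "f", "o", "g"):
        params["W_" + gate] = [[0.0] * (n_in + n_hidden) for _ in range(n_hidden)]
        params["b_" + gate] = [0.0] * n_hidden
    return params


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a standard LSTM cell (baseline, not the paper's variant)."""
    z = x + h_prev  # list concatenation: [x; h_prev]
    i = [sigmoid(v) for v in affine(params["W_i"], z, params["b_i"])]    # input gate
    f = [sigmoid(v) for v in affine(params["W_f"], z, params["b_f"])]    # forget gate
    o = [sigmoid(v) for v in affine(params["W_o"], z, params["b_o"])]    # output gate
    g = [math.tanh(v) for v in affine(params["W_g"], z, params["b_g"])]  # candidate
    # Cell state: keep part of the old state, add part of the candidate.
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    # Hidden state: gated, squashed view of the cell state.
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c


# Usage: with all-zero parameters every gate is 0.5 and the candidate is 0,
# so the cell state is simply halved at each step.
h, c = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -2.0], zero_params(3, 2))
```

The sigmoid gates and tanh squashing here correspond to the Sigmoid Activation and Tanh Activation entries listed under Methods.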
Taxonomy
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
