Recurrent Memory Array Structures
Kamil Rocki

TL;DR
This paper proposes augmenting the LSTM with multiple memory cells per hidden unit, in both deterministic and stochastic variants. The stochastic variant improves character-level text prediction on enwik8, and the paper also establishes baseline results for enwik9 and enwik10.
Contribution
It introduces Array-LSTM, an architecture with multiple memory cells per hidden unit that is intended to improve generalization. The nondeterministic variant achieves state-of-the-art results in character-level text prediction on enwik8.
Findings
The nondeterministic Array-LSTM achieves 1.402 bits per character (BPC) on enwik8.
The paper establishes a neural baseline of 1.12 BPC on enwik9.
The paper establishes a neural baseline of 1.19 BPC on enwik10.
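For readers unfamiliar with the metric, BPC is the average number of bits a model needs to encode each character, computed as the mean negative base-2 log of the probability it assigned to the character that actually occurred. The following is a minimal sketch of that calculation. The function name `bits_per_character` and the sample probabilities are illustrative assumptions, not taken from the paper.

```python
import math


def bits_per_character(probs_of_true_char):
    """Mean negative log2 probability assigned to each correct character.

    `probs_of_true_char` holds, for every position in the text, the
    probability the model gave to the character that actually occurred.
    """
    if not probs_of_true_char:
        raise ValueError("need at least one prediction")
    total_bits = -sum(math.log2(p) for p in probs_of_true_char)
    return total_bits / len(probs_of_true_char)


# Hypothetical predictions: two confident guesses and two less confident ones.
example_bpc = bits_per_character([0.5, 0.25, 0.5, 0.25])

# A model that spreads probability uniformly over 256 byte values
# cannot do better than 8 bits per character.
uniform_bpc = bits_per_character([1.0 / 256] * 10)
```

Lower is better: a score of 1.402 BPC means that, on average, the model assigns each correct character a probability of roughly 2^-1.402, or about 0.38.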
Abstract
The following report introduces ideas for augmenting the standard Long Short-Term Memory (LSTM) architecture with multiple memory cells per hidden unit in order to improve its generalization capabilities. It considers both deterministic and stochastic variants of memory operation. It is shown that the nondeterministic Array-LSTM approach improves state-of-the-art performance on character-level text prediction, achieving 1.402 BPC on the enwik8 dataset. Furthermore, this report establishes baseline neural-based results of 1.12 BPC and 1.19 BPC for the enwik9 and enwik10 datasets, respectively.
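The idea of several memory cells per hidden unit can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the update equations, so the sketch assumes each of the K cells has its own input, forget, and output gates, and that the hidden state is the sum of the gated cell outputs. The optional `update_mask` is a hypothetical way of making the memory operation stochastic by updating only randomly chosen cells. All names and shapes here are invented for the example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(n_in, n_hidden, n_cells, seed=0):
    """Random weights: for each of K cells, 4 gate blocks (i, f, o, candidate)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 0.1, size=(n_cells, 4, n_hidden, n_in + n_hidden))
    b = np.zeros((n_cells, 4, n_hidden))
    return W, b


def array_lstm_step(x, h_prev, c_prev, W, b, update_mask=None):
    """One time step. c_prev has shape (K, H); returns h of shape (H,) and c of shape (K, H)."""
    z = np.concatenate([x, h_prev])                   # shared input to every cell
    pre = np.einsum("kghd,d->kgh", W, z) + b          # pre-activations, shape (K, 4, H)
    i = sigmoid(pre[:, 0])                            # input gates, one per cell
    f = sigmoid(pre[:, 1])                            # forget gates
    o = sigmoid(pre[:, 2])                            # output gates
    g = np.tanh(pre[:, 3])                            # candidate cell values
    c_new = f * c_prev + i * g                        # assumed per-cell LSTM update
    if update_mask is not None:
        # Hypothetical stochastic variant: cells with mask 0 keep their old state.
        c_new = update_mask * c_new + (1.0 - update_mask) * c_prev
    h = np.sum(o * np.tanh(c_new), axis=0)            # assumed: sum over the K cells
    return h, c_new


def sample_one_hot_mask(n_cells, n_hidden, rng):
    """Pick exactly one cell to update for each hidden unit (illustrative choice)."""
    idx = rng.integers(0, n_cells, size=n_hidden)
    mask = np.zeros((n_cells, n_hidden))
    mask[idx, np.arange(n_hidden)] = 1.0
    return mask


# Usage: 3 inputs, 5 hidden units, 4 memory cells per unit.
W, b = init_params(n_in=3, n_hidden=5, n_cells=4)
x, h0, c0 = np.ones(3), np.zeros(5), np.zeros((4, 5))
h_det, c_det = array_lstm_step(x, h0, c0, W, b)
mask = sample_one_hot_mask(4, 5, np.random.default_rng(1))
h_sto, c_sto = array_lstm_step(x, h0, c0, W, b, update_mask=mask)
```

With K = 1 and no mask, the step reduces to an ordinary LSTM update, which is a quick way to sanity-check the sketch against a standard implementation.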
Taxonomy
Topics: Topic Modeling · Neural Networks and Applications · Gaussian Processes and Bayesian Inference
