Recurrent Memory Array Structures

Kamil Rocki

arXiv:1607.03085 · cs.LG · October 25, 2016 · 5 cites

TL;DR

This paper augments the LSTM with multiple memory cells per hidden unit, in both deterministic and stochastic variants. The stochastic variant improves character-level text prediction on enwik8, and the paper also establishes neural baseline results for enwik9 and enwik10.

Contribution

It introduces Array-LSTM, an architecture with multiple memory cells per hidden unit that is intended to improve generalization. Its nondeterministic variant is reported to improve on the state of the art in character-level text prediction.

Findings

01

The nondeterministic Array-LSTM achieves 1.402 BPC (bits per character) on enwik8.

02

Baseline results of 1.12 BPC on enwik9.

03

Baseline results of 1.19 BPC on enwik10.
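The figures above are in bits per character (BPC): the average negative base-2 log probability a model assigns to the correct next character. A minimal sketch of that computation follows. The function names are illustrative and not from the paper, and the size estimate assumes one character per byte and a rate that holds over the whole file, which is only a rough reading of a test-set number.

```python
import math


def bits_per_character(true_char_probs):
    """Mean of -log2(p) over the probabilities assigned to the true characters."""
    if not true_char_probs:
        raise ValueError("need at least one prediction")
    return -sum(math.log2(p) for p in true_char_probs) / len(true_char_probs)


def implied_size_bytes(bpc, n_chars):
    """Size an ideal entropy coder would reach at this rate (hypothetical helper)."""
    return bpc * n_chars / 8


# A model that always gives the true character probability 0.5 scores 1 BPC.
print(bits_per_character([0.5, 0.5, 0.5]))

# Illustration only: 1.402 BPC applied to 10**8 one-byte characters.
print(implied_size_bytes(1.402, 10**8))
```

Lower is better: a uniform guess over 256 byte values would score 8 BPC, so 1.402 BPC means the model is far more certain than chance about each next character.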

Abstract

This report introduces ideas for augmenting the standard Long Short-Term Memory (LSTM) architecture with multiple memory cells per hidden unit in order to improve its generalization capabilities. It considers both deterministic and stochastic variants of memory operation. The nondeterministic Array-LSTM approach is shown to improve state-of-the-art performance on character-level text prediction, achieving 1.402 BPC on the enwik8 dataset. Furthermore, the report establishes baseline neural-based results of 1.12 BPC and 1.19 BPC for the enwik9 and enwik10 datasets, respectively.
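To make "multiple memory cells per hidden unit" concrete, here is a minimal pure-Python sketch of one deterministic step. The abstract does not give the equations, so the details are assumptions: each of the K cells gets its own forget, input, and output gates plus a candidate value, and the hidden output sums the gated cells. The names `array_lstm_step` and `init_params` are hypothetical, and the stochastic variants are not shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_params(n_in, n_hidden, n_cells, seed=0):
    """Random weights: one set of f/i/o/g matrices per cell lane (assumed layout)."""
    rng = random.Random(seed)
    width = n_in + n_hidden + 1  # input, previous hidden state, bias
    return [
        {name: [[rng.uniform(-0.1, 0.1) for _ in range(width)]
                for _ in range(n_hidden)]
         for name in ("f", "i", "o", "g")}
        for _ in range(n_cells)
    ]


def array_lstm_step(x, h_prev, c_prev, params):
    """One step; c_prev[k][j] is memory cell k of hidden unit j."""
    z = list(x) + list(h_prev) + [1.0]
    n_hidden = len(h_prev)
    h = [0.0] * n_hidden
    c = []
    for k, lane in enumerate(params):
        c_k = []
        for j in range(n_hidden):
            pre = {name: sum(w * v for w, v in zip(lane[name][j], z))
                   for name in ("f", "i", "o", "g")}
            f, i, o = sigmoid(pre["f"]), sigmoid(pre["i"]), sigmoid(pre["o"])
            g = math.tanh(pre["g"])
            c_kj = f * c_prev[k][j] + i * g   # per-cell memory update
            c_k.append(c_kj)
            h[j] += o * math.tanh(c_kj)       # assumed: outputs summed over cells
        c.append(c_k)
    return h, c


n_in, n_hidden, n_cells = 4, 3, 2
params = init_params(n_in, n_hidden, n_cells)
h = [0.0] * n_hidden
c = [[0.0] * n_hidden for _ in range(n_cells)]
h, c = array_lstm_step([1.0, 0.0, 0.0, 0.0], h, c, params)
print(len(h), len(c), len(c[0]))
```

With `n_cells = 1` this sketch reduces to an ordinary LSTM step, which is a convenient sanity check on the assumed formulation.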

Taxonomy

Topics: Topic Modeling · Neural Networks and Applications · Gaussian Processes and Bayesian Inference