Fast-Slow Recurrent Neural Networks

Asier Mujika; Florian Meier; Angelika Steger

arXiv:1705.08639 · cs.NE · June 13, 2017 · 41 citations

TL;DR

The paper introduces the Fast-Slow RNN (FS-RNN), a novel architecture that processes sequential data on multiple timescales, achieving state-of-the-art results in character-level language modeling and demonstrating improved learning dynamics.

Contribution

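The paper proposes the FS-RNN architecture, which combines the strengths of multiscale RNNs (processing sequential data on different timescales) and deep transition RNNs (learning complex transition functions from one time step to the next), and reports improved performance and learning dynamics.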
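To make the combination of a slow timescale and a deep transition concrete, here is a minimal sketch of one FS-RNN time step in plain Python. It rests on several assumptions that this summary does not confirm: the wiring (the first fast cell reads the input, the slow cell is updated once from that cell's output, the second fast cell reads the slow state, and any remaining fast cells take no external input) is one reading of the architecture and should be checked against the paper and the official repository. The cells here are simple tanh cells, whereas the repository name suggests LSTM cells. The names `make_cell` and `fs_rnn_step` are hypothetical.

```python
import math
import random


def make_cell(n_in, n_hidden, rng):
    """Hypothetical tanh cell (not an LSTM): returns a function (h_prev, x) -> h_new."""
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_hidden)]
    w_x = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]

    def cell(h_prev, x):
        # With n_in == 0 the input term is an empty sum, so the cell
        # depends only on the previous hidden state.
        return [
            math.tanh(
                sum(w * h for w, h in zip(w_h[i], h_prev))
                + sum(w * v for w, v in zip(w_x[i], x))
            )
            for i in range(n_hidden)
        ]

    return cell


def fs_rnn_step(fast_cells, slow_cell, h_fast, h_slow, x):
    """One time step: k sequential fast updates, one slow update (assumed wiring)."""
    if len(fast_cells) < 2:
        raise ValueError("this sketch needs at least two fast cells")
    h_fast = fast_cells[0](h_fast, x)        # first fast cell reads the input
    h_slow = slow_cell(h_slow, h_fast)       # slow cell updates once per time step
    h_fast = fast_cells[1](h_fast, h_slow)   # second fast cell reads the slow state
    for cell in fast_cells[2:]:              # remaining fast cells deepen the transition
        h_fast = cell(h_fast, [])
    return h_fast, h_slow


# Usage: k = 4 fast cells, one slow cell, two one-hot input steps.
rng = random.Random(0)
n_x, n_fast, n_slow, k = 3, 4, 5, 4
fast_cells = [make_cell(n_x, n_fast, rng), make_cell(n_slow, n_fast, rng)]
fast_cells += [make_cell(0, n_fast, rng) for _ in range(k - 2)]
slow_cell = make_cell(n_fast, n_slow, rng)

h_fast, h_slow = [0.0] * n_fast, [0.0] * n_slow
for x in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]):
    h_fast, h_slow = fs_rnn_step(fast_cells, slow_cell, h_fast, h_slow, x)
```

In this sketch the fast state passes through k cell updates per time step while the slow state passes through only one, which is how the two timescales arise under the assumed wiring.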

Findings

01  Achieved a new state-of-the-art result on Penn Treebank character-level language modeling (1.19 BPC)

02  Set a new state-of-the-art result on Hutter Prize Wikipedia (1.25 BPC)

03  An ensemble of two FS-RNNs reaches 1.20 BPC on Hutter Prize Wikipedia, outperforming the best known compression algorithm

Abstract

Processing sequential data of variable length is a major challenge in a wide range of applications, such as speech recognition, language modeling, generative image modeling and machine translation. Here, we address this challenge by proposing a novel recurrent neural network (RNN) architecture, the Fast-Slow RNN (FS-RNN). The FS-RNN incorporates the strengths of both multiscale RNNs and deep transition RNNs as it processes sequential data on different timescales and learns complex transition functions from one time step to the next. We evaluate the FS-RNN on two character-level language modeling data sets, Penn Treebank and Hutter Prize Wikipedia, where we improve state-of-the-art results to $1.19$ and $1.25$ bits-per-character (BPC), respectively. In addition, an ensemble of two FS-RNNs achieves $1.20$ BPC on Hutter Prize Wikipedia, outperforming the best known compression algorithm…
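The results above are reported in bits-per-character (BPC). As a minimal sketch of how that metric is usually computed, assuming the standard definition (the average negative base-2 log-probability a model assigns to each true character), the helpers below compute BPC from per-character probabilities and convert a mean cross-entropy in nats to BPC. The function names are hypothetical and the input probabilities are made up for illustration; they are not results from the paper.

```python
import math


def bits_per_character(probs):
    """Average negative log2-probability assigned to each true character."""
    if not probs:
        raise ValueError("need at least one probability")
    return -sum(math.log2(p) for p in probs) / len(probs)


def nats_to_bpc(mean_cross_entropy_nats):
    """Convert a mean per-character cross-entropy from nats to bits."""
    return mean_cross_entropy_nats / math.log(2)


# Illustrative only: a model that spreads probability uniformly over
# 4 symbols assigns 1/4 to every true character, i.e. 2 bits per character.
uniform_bpc = bits_per_character([0.25] * 8)
print(uniform_bpc)  # 2.0
```

Because BPC is also the average code length an ideal entropy coder would need under the model, it allows a language model to be compared directly with compression algorithms.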

Code & Models

Repositories

amujika/Fast-Slow-LSTM (TensorFlow, official)

Taxonomy

Topics: Neural Networks and Applications · Neural Networks and Reservoir Computing · Machine Learning and ELM