Fast-Slow Recurrent Neural Networks
Asier Mujika, Florian Meier, Angelika Steger

TL;DR
The paper introduces the Fast-Slow RNN (FS-RNN), a novel architecture that processes sequential data on multiple timescales, achieving state-of-the-art results in character-level language modeling and demonstrating improved learning dynamics.
Contribution
The paper proposes the FS-RNN architecture, which combines the strengths of multiscale RNNs and deep transition RNNs: it processes sequential data on different timescales and learns complex transition functions from one time step to the next.
Findings
Improves the state of the art on character-level Penn Treebank to 1.19 bits-per-character (BPC)
Improves the state of the art on Hutter Prize Wikipedia to 1.25 BPC
An ensemble of two FS-RNNs outperforms the best known compression algorithm on Hutter Prize Wikipedia
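For readers unfamiliar with the metric, the sketch below shows how bits-per-character is commonly computed: the average negative base-2 log of the probability a model assigns to each true character. It also shows why BPC relates to compression, since an ideal entropy coder driven by the model would need roughly BPC bits per character. The function names, the example probabilities, and the character count are hypothetical and not taken from the paper; the size estimate ignores the cost of storing the model itself.

```python
import math


def bits_per_character(probs):
    """Average negative log2 probability assigned to each true character."""
    if not probs:
        raise ValueError("need at least one prediction")
    return -sum(math.log2(p) for p in probs) / len(probs)


def compressed_size_bytes(bpc, n_chars):
    """Idealised compressed size in bytes at a given BPC (model size ignored)."""
    return bpc * n_chars / 8


# Hypothetical probabilities a model gave to the correct next character.
example_probs = [0.5, 0.25, 0.5, 0.25]
example_bpc = bits_per_character(example_probs)  # (1 + 2 + 1 + 2) / 4 = 1.5

# Hypothetical text of one million characters coded at 1.25 BPC.
example_size = compressed_size_bytes(1.25, 1_000_000)  # 156250.0 bytes
```

Lower BPC therefore means both a better predictive model and, under these idealised assumptions, a smaller compressed output.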
Abstract
Processing sequential data of variable length is a major challenge in a wide range of applications, such as speech recognition, language modeling, generative image modeling and machine translation. Here, we address this challenge by proposing a novel recurrent neural network (RNN) architecture, the Fast-Slow RNN (FS-RNN). The FS-RNN incorporates the strengths of both multiscale RNNs and deep transition RNNs as it processes sequential data on different timescales and learns complex transition functions from one time step to the next. We evaluate the FS-RNN on two character-level language modeling data sets, Penn Treebank and Hutter Prize Wikipedia, where we improve state-of-the-art results to 1.19 and 1.25 bits-per-character (BPC), respectively. In addition, an ensemble of two FS-RNNs on Hutter Prize Wikipedia outperforms the best known compression algorithm…
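The following is a minimal sketch of what one FS-RNN time step could look like. It rests on assumptions not stated on this page: a stack of k fast cells F1..Fk is applied sequentially within each time step, a single slow cell S is updated once per time step from the output of F1, and F2 reads the slow state. Plain tanh cells stand in for whatever recurrent cell the authors use, and all names (`make_cell`, `cell_step`, `fs_rnn_step`) and sizes are hypothetical.

```python
import math
import random


def make_cell(n_in, n_hidden, rng):
    """Weights for a plain tanh cell: h' = tanh(W [h; x] + b). Hypothetical helper."""
    cols = n_hidden + n_in
    W = [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(n_hidden)]
    b = [0.0] * n_hidden
    return W, b


def cell_step(cell, h, x):
    """Apply one cell update to state h with input x (x may be an empty list)."""
    W, b = cell
    v = h + x  # concatenate state and input
    return [math.tanh(sum(w * vi for w, vi in zip(row, v)) + bi)
            for row, bi in zip(W, b)]


def fs_rnn_step(fast_cells, slow_cell, h_fast, h_slow, x):
    """One time step under the assumed update order; needs at least two fast cells."""
    # F1 consumes the external input and the previous fast state.
    h_fast = cell_step(fast_cells[0], h_fast, x)
    # S is updated once per time step, from the F1 output (the slow timescale).
    h_slow = cell_step(slow_cell, h_slow, h_fast)
    # F2 reads the fresh slow state.
    h_fast = cell_step(fast_cells[1], h_fast, h_slow)
    # F3..Fk take no external input: this is the deep transition part.
    for cell in fast_cells[2:]:
        h_fast = cell_step(cell, h_fast, [])
    return h_fast, h_slow


# Hypothetical sizes: 3 input features, 4 fast units, 5 slow units, k = 4 fast cells.
rng = random.Random(0)
n_in, n_fast, n_slow, k = 3, 4, 5, 4
fast_cells = ([make_cell(n_in, n_fast, rng), make_cell(n_slow, n_fast, rng)]
              + [make_cell(0, n_fast, rng) for _ in range(k - 2)])
slow_cell = make_cell(n_fast, n_slow, rng)

h_fast, h_slow = [0.0] * n_fast, [0.0] * n_slow
for x in [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]:  # two one-hot "characters"
    h_fast, h_slow = fs_rnn_step(fast_cells, slow_cell, h_fast, h_slow, x)
```

In this sketch the slow state changes once per time step while the fast state changes k times, which is one way to realise the "different timescales" the abstract describes; the final fast state would feed the output layer.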
Taxonomy
Topics: Neural Networks and Applications · Neural Networks and Reservoir Computing · Machine Learning and ELM
