Variable Computation in Recurrent Neural Networks

Yacine Jernite; Edouard Grave; Armand Joulin; Tomas Mikolov

arXiv:1611.06188 · stat.ML · March 6, 2017 · 28 cites

Open Access

TL;DR

This paper introduces a modification to recurrent neural networks that enables them to adapt their computation at each step, improving efficiency and performance on sequential data tasks.

Contribution

The paper proposes a modification to existing recurrent units that lets them learn how much computation to perform at each step, without prior knowledge of the sequence's time structure.

Findings

1. Models perform fewer operations while maintaining accuracy
2. Variable computation improves overall task performance
3. Enhanced efficiency over traditional fixed-computation RNNs

Abstract

Recurrent neural networks (RNNs) have been used extensively and with increasing success to model various types of sequential data. Much of this progress has been achieved through devising recurrent units and architectures with the flexibility to capture complex statistics in the data, such as long range dependency or localized attention phenomena. However, while many sequential data (such as video, speech or language) can have highly variable information flow, most recurrent models still consume input features at a constant rate and perform a constant number of computations per time step, which can be detrimental to both speed and model capacity. In this paper, we explore a modification to existing recurrent units which allows them to learn to vary the amount of computation they perform at each step, without prior knowledge of the sequence's time structure. We show experimentally that…
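
To make the idea in the abstract concrete, here is a minimal sketch of one way a recurrent step could vary its computation. It is not the paper's formulation: the function name `variable_step`, the scheduler parameters `u`, `v`, `b`, and the hard choice of recomputing only the first `d` state dimensions are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def variable_step(h_prev, x, W, U, u, v, b):
    """One recurrent step that recomputes only a leading slice of the state.

    h_prev  : previous hidden state, list of D floats
    x       : current input, list of N floats
    W, U    : recurrent (D x D) and input (D x N) weights, nested lists
    u, v, b : hypothetical scheduler parameters (D floats, N floats, scalar)
    Returns the new state and the number of dimensions that were recomputed.
    """
    D, N = len(h_prev), len(x)
    # Scheduler: a scalar in (0, 1) giving the fraction of the state to update.
    m = sigmoid(
        sum(ui * hi for ui, hi in zip(u, h_prev))
        + sum(vi * xi for vi, xi in zip(v, x))
        + b
    )
    d = min(D, max(1, math.ceil(m * D)))  # dimensions to recompute this step
    h_new = list(h_prev)  # dimensions d..D-1 are carried over unchanged
    for i in range(d):
        # Only the first d rows and columns of W are used, so the cost of
        # this step scales with d rather than with the full state size D.
        pre = sum(W[i][j] * h_prev[j] for j in range(d))
        pre += sum(U[i][k] * x[k] for k in range(N))
        h_new[i] = math.tanh(pre)
    return h_new, d


D, N = 8, 3
rng = random.Random(0)
W = [[rng.uniform(-0.5, 0.5) for _ in range(D)] for _ in range(D)]
U = [[rng.uniform(-0.5, 0.5) for _ in range(N)] for _ in range(D)]
h0 = [rng.uniform(-1.0, 1.0) for _ in range(D)]
x = [1.0, 0.0, -1.0]

# Scheduler weights are zeroed here so the bias alone sets the budget.
h_small, d_small = variable_step(h0, x, W, U, [0.0] * D, [0.0] * N, b=-20.0)
h_full, d_full = variable_step(h0, x, W, U, [0.0] * D, [0.0] * N, b=20.0)
print(d_small, d_full)  # 1 8
```

With a strongly negative scheduler bias the step recomputes a single dimension and copies the rest; with a strongly positive bias it recomputes all of them. The hard cutoff used here is not differentiable, so a trainable version would presumably need a smooth relaxation, which this sketch does not attempt.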

Taxonomy

Topics: Neural Networks and Applications