Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks

Victor Campos, Brendan Jou, Xavier Giro-i-Nieto, Jordi Torres, Shih-Fu Chang

arXiv:1708.06834·cs.AI·February 6, 2018·38 cites

TL;DR

This paper introduces Skip RNN, a model that learns to skip state updates in recurrent neural networks, reducing computational cost while maintaining or improving performance on sequence tasks.

Contribution

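The paper proposes Skip RNN, which extends existing RNN models with a learned mechanism for skipping state updates. Skipping shortens the effective computational graph, which lowers the computational cost and eases the difficulties of training on long sequences, such as vanishing gradients and capturing long-term dependencies.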

Findings

01. Reduces the number of RNN updates needed

02. Maintains or improves performance on sequence tasks

03. Demonstrates efficiency gains in various experiments

Abstract

Recurrent Neural Networks (RNNs) continue to show outstanding performance in sequence modeling tasks. However, training RNNs on long sequences often faces challenges such as slow inference, vanishing gradients, and difficulty capturing long-term dependencies. In backpropagation-through-time settings, these issues are tightly coupled with the large, sequential computational graph that results from unfolding the RNN in time. We introduce the Skip RNN model, which extends existing RNN models by learning to skip state updates, shortening the effective size of the computational graph. This model can also be encouraged to perform fewer state updates through a budget constraint. We evaluate the proposed model on various tasks and show how it can reduce the number of required RNN updates while preserving, and sometimes even improving, the performance of the baseline RNN models. Source code is…
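The idea of skipping state updates under a budget can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: the abstract gives no equations, so the scalar tanh cell, the accumulating update probability `u_tilde`, the 0.5 binarization threshold, and the per-update cost `cost_per_update` are all assumptions chosen for illustration, as are the function and parameter names.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def skip_rnn_forward(xs, w_x=0.8, w_h=0.5, w_p=0.0, b_p=-1.0,
                     cost_per_update=0.0):
    """Toy scalar RNN that may skip state updates (illustrative assumptions only).

    Returns (states, flags, budget_penalty), where flags[t] is 1 if the
    state was updated at step t and 0 if it was copied from step t-1.
    """
    state = 0.0
    u_tilde = 1.0  # assumed: start at 1 so the first step always updates
    states, flags = [], []
    for x in xs:
        # Binarize the update probability into a hard update/skip decision.
        u = 1 if u_tilde >= 0.5 else 0
        if u:
            # Update step: run the (assumed) tanh recurrence.
            state = math.tanh(w_x * x + w_h * state)
        # Increment of the update probability, computed from the current state.
        delta = sigmoid(w_p * state + b_p)
        if u:
            # After an update, the probability restarts from the increment.
            u_tilde = delta
        else:
            # After a skip, the probability accumulates so that an update
            # eventually happens; the state itself is left untouched.
            u_tilde = u_tilde + min(delta, 1.0 - u_tilde)
        states.append(state)
        flags.append(u)
    # Assumed form of the budget constraint: a cost for every update performed.
    budget_penalty = cost_per_update * sum(flags)
    return states, flags, budget_penalty


states, flags, penalty = skip_rnn_forward(
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], b_p=-1.0, cost_per_update=0.1
)
print(flags)
```

With these hypothetical parameters the cell alternates between updating and copying its state, so only half of the time steps run the recurrence. In a trained model, a penalty such as `budget_penalty` would be added to the task loss so that the gate parameters are pushed toward fewer updates.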

Taxonomy

Topics: Topic Modeling · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning