Improving the Gating Mechanism of Recurrent Neural Networks

Albert Gu; Caglar Gulcehre; Tom Le Paine; Matt Hoffman; Razvan Pascanu

arXiv:1910.09890 · cs.NE · June 22, 2020 · 30 cites

TL;DR

This paper proposes simple modifications to gating mechanisms in recurrent neural networks that enhance their learnability and performance, especially in tasks requiring long-term dependencies, without adding hyperparameters.

Contribution

The authors introduce two easy-to-implement modifications to standard gating mechanisms that improve gradient flow and learnability in recurrent models.

Findings

01 Enhanced recurrent models perform better on long-term dependency tasks.

02 Modifications improve gradient propagation in saturated gating regimes.

03 Empirical results show robustness across various applications.

Abstract

Gating mechanisms are widely used in neural network models, where they allow gradients to backpropagate more easily through depth or time. However, their saturation property introduces problems of its own. For example, in recurrent models these gates need to have outputs near 1 to propagate information over long time-delays, which requires them to operate in their saturation regime and hinders gradient-based learning of the gate mechanism. We address this problem by deriving two synergistic modifications to the standard gating mechanism that are easy to implement, introduce no additional hyperparameters, and improve learnability of the gates when they are close to saturation. We show how these changes are related to and improve on alternative recently proposed gating mechanisms such as chrono initialization and Ordered Neurons. Empirically, our simple gating mechanisms robustly improve…
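
To make the saturation problem described above concrete, here is a minimal Python sketch. It is not taken from the paper or its repository. It assumes a single scalar sigmoid gate held at a constant value, so that a stored value is scaled by gate**steps after that many steps; the helper names (retention, gate_grad) are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    """Standard logistic function, the usual gate nonlinearity."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Inverse of the sigmoid: the pre-activation that yields gate value p."""
    return math.log(p / (1.0 - p))


def retention(gate, steps):
    """Fraction of a stored value left after `steps` updates,
    assuming the gate stays constant (a simplifying assumption)."""
    return gate ** steps


def gate_grad(gate):
    """Derivative of the sigmoid with respect to its pre-activation,
    written in terms of the gate value: sigma'(x) = sigma(x) * (1 - sigma(x))."""
    return gate * (1.0 - gate)


if __name__ == "__main__":
    steps = 100
    for gate in (0.5, 0.9, 0.99, 0.999):
        print(
            f"gate={gate:<6} pre-activation={logit(gate):7.3f} "
            f"retention over {steps} steps={retention(gate, steps):.3e} "
            f"local gradient={gate_grad(gate):.3e}"
        )
```

Under these assumptions, the gate values that retain information over many steps are exactly the ones whose local gradient is smallest, which is the tension the abstract points to.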

Code & Models

Repositories

aithlab/ImprovingGate (PyTorch)

Videos

Improving the Gating Mechanism of Recurrent Neural Networks · SlidesLive

Taxonomy

Topics: Neural Networks and Applications · Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications