Recurrent Additive Networks

Kenton Lee; Omer Levy; Luke Zettlemoyer

arXiv:1705.07393 · cs.CL · June 30, 2017 · 30 cites

TL;DR

Recurrent Additive Networks (RANs) are a new type of gated RNN that uses purely additive state updates. They perform comparably to LSTMs on benchmark language modeling tasks, which challenges the assumption that non-linear state transitions are necessary.

Contribution

This paper introduces RANs, a simple gated RNN architecture with purely additive state updates, and shows that they match LSTMs on benchmark language modeling, suggesting that many non-linearities are not essential for these tasks.

Findings

01

RAN states are weighted sums of input vectors.

02

RANs perform on par with LSTMs on benchmark language modeling problems.

03

The gates contribute only to computing the weights of these sums.
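
The first and third findings can be made concrete by unrolling the additive update. The notation below is an assumption of this sketch, not taken from this page: \tilde{x}_t stands for the (linearly projected) input at step t, i_t and f_t for the input and forget gates, \odot for the component-wise product, and the initial state c_0 is taken to be zero.

```latex
\begin{align*}
c_t &= i_t \odot \tilde{x}_t + f_t \odot c_{t-1}, \qquad c_0 = 0 \\
    &= i_t \odot \tilde{x}_t + f_t \odot i_{t-1} \odot \tilde{x}_{t-1} + f_t \odot f_{t-1} \odot c_{t-2} \\
    &= \sum_{j=1}^{t} \Big( i_j \odot \prod_{k=j+1}^{t} f_k \Big) \odot \tilde{x}_j
\end{align*}
```

Under these assumptions, each state is a component-wise weighted sum of the inputs seen so far, and the gates enter only through the weights in parentheses (the empty product for j = t is taken to be 1).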

Abstract

We introduce recurrent additive networks (RANs), a new gated RNN which is distinguished by the use of purely additive latent state updates. At every time step, the new state is computed as a gated component-wise sum of the input and the previous state, without any of the non-linearities commonly used in RNN transition dynamics. We formally show that RAN states are weighted sums of the input vectors, and that the gates only contribute to computing the weights of these sums. Despite this relatively simple functional form, experiments demonstrate that RANs perform on par with LSTMs on benchmark language modeling problems. This result shows that many of the non-linear computations in LSTMs and related networks are not essential, at least for the problems we consider, and suggests that the gates are doing more of the computational work than previously understood.
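
The update described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the parameter names (`W_cx`, `W_ix`, `W_ih`, and so on), the helper `init_params`, the sigmoid gates conditioned on the previous output, and the `tanh` applied to the state to form that output are all choices made for this sketch. The second function recomputes the final state directly as a gate-weighted sum of the projected inputs, so the two can be compared numerically.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(d_in, d_hidden, rng):
    """Hypothetical helper: small random parameters for the sketch."""
    def mat(rows, cols):
        return rng.normal(scale=0.5, size=(rows, cols))
    return {
        "W_cx": mat(d_hidden, d_in),      # linear input projection
        "W_ix": mat(d_hidden, d_in),      # input gate, input part
        "W_ih": mat(d_hidden, d_hidden),  # input gate, recurrent part
        "b_i": np.zeros(d_hidden),
        "W_fx": mat(d_hidden, d_in),      # forget gate, input part
        "W_fh": mat(d_hidden, d_hidden),  # forget gate, recurrent part
        "b_f": np.zeros(d_hidden),
    }


def ran_forward(xs, p):
    """Run the additive recurrence over xs with shape (T, d_in)."""
    d_hidden = p["W_cx"].shape[0]
    c = np.zeros(d_hidden)  # state starts at zero (assumption)
    h = np.zeros(d_hidden)  # output fed back into the gates
    states, contents, in_gates, forget_gates = [], [], [], []
    for x in xs:
        content = p["W_cx"] @ x  # projected input, no non-linearity
        i = sigmoid(p["W_ix"] @ x + p["W_ih"] @ h + p["b_i"])
        f = sigmoid(p["W_fx"] @ x + p["W_fh"] @ h + p["b_f"])
        c = i * content + f * c  # gated component-wise sum
        h = np.tanh(c)           # output non-linearity (assumption)
        states.append(c)
        contents.append(content)
        in_gates.append(i)
        forget_gates.append(f)
    return (np.array(states), np.array(contents),
            np.array(in_gates), np.array(forget_gates))


def weighted_sum_state(contents, in_gates, forget_gates):
    """Rebuild the final state as a weighted sum of projected inputs."""
    T = len(contents)
    total = np.zeros_like(contents[0])
    for j in range(T):
        weight = in_gates[j].copy()
        for k in range(j + 1, T):
            weight = weight * forget_gates[k]  # later forget gates decay input j
        total += weight * contents[j]
    return total


rng = np.random.default_rng(0)
params = init_params(d_in=4, d_hidden=3, rng=rng)
xs = rng.normal(size=(6, 4))
states, contents, in_gates, forget_gates = ran_forward(xs, params)
rebuilt = weighted_sum_state(contents, in_gates, forget_gates)
assert np.allclose(states[-1], rebuilt)
```

The agreement between the recurrent state and the rebuilt sum does not depend on how the gates are computed, only on the additive form of the state update, so replacing `tanh` with an identity output would leave the check unchanged.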

Taxonomy

Topics: Topic Modeling · Machine Learning and Algorithms · Ferroelectric and Negative Capacitance Devices