A Lightweight Recurrent Network for Sequence Modeling

Biao Zhang; Rico Sennrich

arXiv:1905.13324 · cs.CL · June 3, 2019 · 1 citation

TL;DR

This paper introduces the lightweight recurrent network (LRN), which improves running efficiency by moving heavy computations outside the recurrence, with little or no loss in performance across six NLP tasks.

Contribution

The paper proposes LRN, a recurrent unit in which all parameter-related calculations are factored outside the recurrence. The recurrence itself only manipulates the weight assigned to each token, which ties LRN closely to self-attention networks.

Findings

01

LRN runs more efficiently than the existing recurrent units it replaces.

02

Across six NLP tasks, LRN shows little or no loss in performance.

03

The experiments apply LRN as a drop-in replacement for existing recurrent units in several neural sequential models.

Abstract

Recurrent networks have achieved great success on various sequential tasks with the assistance of complex recurrent units, but suffer from severe computational inefficiency due to weak parallelization. One direction to alleviate this issue is to shift heavy computations outside the recurrence. In this paper, we propose a lightweight recurrent network, or LRN. LRN uses input and forget gates to handle long-range dependencies as well as gradient vanishing and explosion, with all parameter-related calculations factored outside the recurrence. The recurrence in LRN only manipulates the weight assigned to each token, tightly connecting LRN with self-attention networks. We apply LRN as a drop-in replacement for existing recurrent units in several neural sequential models. Extensive experiments on six NLP tasks show that LRN yields the best running efficiency with little or no loss in model…
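
The idea of keeping all parameter-related work outside the recurrence can be illustrated with a small sketch. The abstract does not give the update equations, so the gate formulation below (input gate from a key projection, forget gate from a query projection, tanh output) is an assumption based on a reading of the paper, not a copy of the official implementation. The names lrn_forward, project, random_matrix, w_q, w_k and w_v are hypothetical. The sketch uses plain Python lists so that it runs without extra libraries.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def project(xs, w):
    """Multiply every token vector by w in one pass: xs is [T][d_in], w is [d_in][d_out]."""
    return [
        [sum(x[i] * w[i][j] for i in range(len(w))) for j in range(len(w[0]))]
        for x in xs
    ]


def random_matrix(rows, cols, rng):
    """Hypothetical helper: small random weights for the demo."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def lrn_forward(xs, w_q, w_k, w_v):
    """Sketch of an LRN-style layer (assumed formulation, see the text above).

    Returns the list of hidden states, one per input token.
    """
    # All weight-matrix products happen here, before the loop, and do not
    # depend on the hidden state, so they could be computed in parallel.
    q = project(xs, w_q)
    k = project(xs, w_k)
    v = project(xs, w_v)

    d = len(w_v[0])
    h = [0.0] * d
    states = []
    for q_t, k_t, v_t in zip(q, k, v):
        # The recurrence is element-wise only: no weight matrix touches h.
        i_t = [sigmoid(k_t[j] + h[j]) for j in range(d)]  # input gate
        f_t = [sigmoid(q_t[j] - h[j]) for j in range(d)]  # forget gate
        h = [math.tanh(i_t[j] * v_t[j] + f_t[j] * h[j]) for j in range(d)]
        states.append(h)
    return states


if __name__ == "__main__":
    rng = random.Random(0)
    tokens = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(5)]
    w_q, w_k, w_v = (random_matrix(4, 3, rng) for _ in range(3))
    for t, h_t in enumerate(lrn_forward(tokens, w_q, w_k, w_v)):
        print(t, [round(value, 3) for value in h_t])
```

In this sketch the loop body contains only sigmoid, tanh, and element-wise arithmetic. That is the sense in which the recurrence only reweights each token's projected value rather than transforming it with learned parameters.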

Code & Models

Repositories

bzhangGo/lrn · tf · Official

Taxonomy

Topics

Topic Modeling · Time Series Analysis and Forecasting · Natural Language Processing Techniques