Restricted Recurrent Neural Networks

Enmao Diao; Jie Ding; Vahid Tarokh

arXiv:1908.07724·cs.CL·May 12, 2020

TL;DR

This paper introduces Restricted Recurrent Neural Networks (RRNNs), a parameter-efficient architecture in which the input and hidden-state weight matrices share a large proportion of their parameters. On language modeling, RRNNs achieve comparable or better performance than classical RNNs with fewer parameters.

Contribution

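The paper proposes RRNN, an RNN compression method that restricts the input and hidden-state weight matrices to share a large proportion of their parameters. It requires no pre-training or sophisticated fine-tuning, and it reduces model size while maintaining or improving performance.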

Findings

1. RRNN achieves about 50% parameter reduction.
2. Restricted LSTM outperforms classical LSTM with fewer parameters.
3. Performance remains comparable or better despite compression.
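
To make the link between weight sharing and parameter count concrete, here is a rough arithmetic sketch. It assumes a hypothetical scheme in which the input and hidden-state matrices of a single-layer vanilla RNN are both `dim x dim` and share a fraction `share_rate` of their rows. The function name, the row-wise sharing, and the sizes are illustrative assumptions, not the paper's exact construction, and the numbers are not a reproduction of the reported results.

```python
def weight_count(dim, share_rate):
    """Count weights in two dim x dim matrices that share some rows.

    Assumption (illustrative only): sharing is done row-wise, so
    `share_rate` is the fraction of rows common to both matrices.
    """
    n_shared = round(share_rate * dim)
    shared = n_shared * dim                # rows stored once, used by both
    private = 2 * (dim - n_shared) * dim   # rows each matrix keeps for itself
    return shared + private


dim = 200
classical = weight_count(dim, 0.0)  # no sharing: two full matrices
for rate in (0.0, 0.5, 1.0):
    restricted = weight_count(dim, rate)
    reduction = 1 - restricted / classical
    print(f"share_rate={rate:.1f}  weights={restricted}  reduction={reduction:.0%}")
```

In this toy setting the saving grows linearly with the sharing rate and tops out at half of the two recurrent matrices. Embedding and output layers, which this sketch ignores, would change the overall ratio.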

Abstract

Recurrent Neural Networks (RNNs) and their variations, such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), have become standard building blocks for learning online data of sequential nature in many research areas, including natural language processing and speech data analysis. In this paper, we present a new methodology to significantly reduce the number of parameters in RNNs while maintaining performance that is comparable to or even better than that of classical RNNs. The new proposal, referred to as Restricted Recurrent Neural Network (RRNN), restricts the weight matrices corresponding to the input data and hidden states at each time step to share a large proportion of parameters. The new architecture can be regarded as a compression of its classical counterpart, but it does not require pre-training or sophisticated parameter fine-tuning, both of which are major issues in most…
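
The sharing idea described in the abstract can be sketched for a vanilla tanh RNN cell as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the input and hidden state have the same dimension, that sharing is done over the leading rows of the two weight matrices, and that NumPy is available. The names `build_weights` and `restricted_rnn_step` are hypothetical.

```python
import numpy as np


def build_weights(dim, share_rate, rng):
    """Return (w_x, w_h) whose first rows come from one shared block.

    Assumption (illustrative only): input and hidden sizes are both `dim`,
    and `share_rate` is the fraction of rows the two matrices have in common.
    """
    n_shared = int(round(share_rate * dim))
    shared = rng.standard_normal((n_shared, dim)) * 0.1
    own_x = rng.standard_normal((dim - n_shared, dim)) * 0.1
    own_h = rng.standard_normal((dim - n_shared, dim)) * 0.1
    w_x = np.vstack([shared, own_x])  # input-to-hidden weights
    w_h = np.vstack([shared, own_h])  # hidden-to-hidden weights
    return w_x, w_h, n_shared


def restricted_rnn_step(x, h, w_x, w_h, bias):
    """One vanilla RNN update: h_next = tanh(W_x x + W_h h + b)."""
    return np.tanh(w_x @ x + w_h @ h + bias)


rng = np.random.default_rng(0)
dim = 8
w_x, w_h, n_shared = build_weights(dim, share_rate=0.5, rng=rng)
bias = np.zeros(dim)

h = np.zeros(dim)
for x in rng.standard_normal((5, dim)):  # a toy sequence of 5 inputs
    h = restricted_rnn_step(x, h, w_x, w_h, bias)
```

In a trainable version the shared block would be a single parameter tensor, so gradients from both the input path and the recurrent path would update it. Here the matrices are only materialised once to show the forward computation.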

Code & Models

Repositories

diaoenmao/Restricted-Recurrent-Neural-Networks
PyTorch · Official

Taxonomy

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory