Investigating Sparsity in Recurrent Neural Networks
Harshil Darji

TL;DR
This paper investigates the impact of sparsity techniques, including pruning and arbitrary structure embedding, on the performance of various RNN architectures, filling a research gap in sparse RNN design.
Contribution
It is the first comprehensive study comparing pruning and arbitrary structure embedding effects on different RNN types.
Findings
Pruning reduces RNN performance, and the pruned network needs multiple epochs of retraining to recover.
The performance of sparse RNNs with arbitrary structures is related to the properties of the underlying graph.
Different RNN architectures respond uniquely to sparsity techniques.
Abstract
In the past few years, neural networks have evolved from simple Feedforward Neural Networks to more complex architectures, such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). CNNs are well suited to tasks where sequence order is not important, such as image recognition, while RNNs are useful when order matters, as in machine translation. Increasing the number of layers in a neural network is one way to improve its performance, but it also increases the network's complexity, making it much more time- and power-consuming to train. One way to tackle this problem is to introduce sparsity in the architecture of the neural network. Pruning is one of many methods for making a neural network architecture sparse: it clips out weights below a certain threshold while keeping performance close to that of the original network. Another way is to generate arbitrary structures using random graphs and…
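The threshold-based pruning mentioned in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the paper's implementation: the function names `magnitude_prune` and `sparsity`, the toy weight matrix, and the threshold of 0.1 are all assumptions made for the example.

```python
def magnitude_prune(weights, threshold):
    """Zero every weight whose absolute value is below `threshold`.

    `weights` is a list of rows (a 2-D weight matrix). Returns the pruned
    matrix and a binary mask with 1 for kept weights and 0 for pruned ones.
    Hypothetical helper for illustration only.
    """
    mask = [[1 if abs(w) >= threshold else 0 for w in row] for row in weights]
    pruned = [
        [w if keep else 0.0 for w, keep in zip(row, mask_row)]
        for row, mask_row in zip(weights, mask)
    ]
    return pruned, mask


def sparsity(mask):
    """Fraction of weights that were pruned (mask entries equal to 0)."""
    total = sum(len(row) for row in mask)
    kept = sum(sum(row) for row in mask)
    return 1.0 - kept / total


if __name__ == "__main__":
    # Toy 2x3 weight matrix; values are made up for the example.
    toy_weights = [[0.8, -0.05, 0.3], [0.01, -0.6, 0.02]]
    pruned, mask = magnitude_prune(toy_weights, threshold=0.1)
    print(pruned)          # [[0.8, 0.0, 0.3], [0.0, -0.6, 0.0]]
    print(sparsity(mask))  # 0.5
```

In a training loop, one would typically keep the mask and reapply it after each weight update so that pruned weights stay at zero while the remaining weights are retrained; whether the paper does exactly this is not stated in the abstract.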
Taxonomy
Methods: Sigmoid Activation · Tanh Activation · Pruning · Long Short-Term Memory · Gated Recurrent Unit
