Investigating Sparsity in Recurrent Neural Networks

Harshil Darji

arXiv:2407.20601 · cs.LG · July 31, 2024

TL;DR

This paper investigates the impact of sparsity techniques, including pruning and arbitrary structure embedding, on the performance of various RNN architectures, filling a research gap in sparse RNN design.

Contribution

It is the first comprehensive study comparing pruning and arbitrary structure embedding effects on different RNN types.

Findings

01

Pruning affects RNN performance and requires multiple epochs for recovery.

02

The performance of sparse RNNs with arbitrary structures is related to properties of the underlying graphs.

03

Different RNN architectures respond uniquely to sparsity techniques.
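
To make the second finding more concrete, the sketch below shows one way a random graph could define a sparse recurrent connectivity pattern, together with one simple graph property (edge density). It is an illustration under stated assumptions, not the paper's method: the directed G(n, p) generator, the use of the adjacency matrix as a mask on a hidden-to-hidden weight matrix, and the names random_graph_mask, density, and apply_mask are all hypothetical choices made for this example.

```python
import random


def random_graph_mask(n, p, seed=0):
    """Adjacency matrix of a directed G(n, p) random graph without self-loops.

    Assumption for illustration: entry [i][j] == 1 means hidden unit i
    keeps its connection to hidden unit j.
    """
    rng = random.Random(seed)
    return [
        [1 if i != j and rng.random() < p else 0 for j in range(n)]
        for i in range(n)
    ]


def density(mask):
    """Fraction of possible directed edges (excluding self-loops) present."""
    n = len(mask)
    if n < 2:
        return 0.0
    edges = sum(sum(row) for row in mask)
    return edges / (n * (n - 1))


def apply_mask(weights, mask):
    """Zero every weight whose position is not an edge in the graph."""
    return [
        [w if m else 0.0 for w, m in zip(w_row, m_row)]
        for w_row, m_row in zip(weights, mask)
    ]


if __name__ == "__main__":
    n = 6
    mask = random_graph_mask(n, p=0.3, seed=42)
    init = random.Random(0)
    weights = [[init.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    sparse_weights = apply_mask(weights, mask)
    print("edge density:", round(density(mask), 3))
```

In a training loop, a mask like this would typically be reapplied after each update so that removed connections stay at zero; whether the paper does this is not stated on this page.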

Abstract

In the past few years, neural networks have evolved from simple Feedforward Neural Networks to more complex architectures such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). Whereas CNNs are well suited to tasks where sequence order is not important, such as image recognition, RNNs are useful when order matters, as in machine translation. Increasing the number of layers in a neural network is one way to improve its performance, but it also increases complexity, making training considerably more time- and power-consuming. One way to tackle this problem is to introduce sparsity into the network architecture. Pruning is one of many methods for making a neural network sparse: it clips out weights below a certain threshold while keeping performance close to the original. Another way is to generate arbitrary structures using random graphs and…
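
The threshold-based pruning described in the abstract can be sketched in a few lines. This is a minimal illustration, not the thesis code: it assumes the threshold is derived from a target fraction of weights to remove, and the function name magnitude_prune and the example matrix are hypothetical.

```python
def magnitude_prune(weights, fraction):
    """Zero out roughly `fraction` of the weights with the smallest magnitude.

    The threshold is the k-th smallest absolute value, where
    k = floor(fraction * number_of_weights). Entries whose magnitude is at
    or below the threshold are set to 0.0, so ties at the threshold can
    remove slightly more than k weights.
    Returns (pruned_weights, threshold); threshold is None if nothing is pruned.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    magnitudes = sorted(abs(w) for row in weights for w in row)
    k = int(len(magnitudes) * fraction)
    if k == 0:
        return [row[:] for row in weights], None
    threshold = magnitudes[k - 1]
    pruned = [
        [0.0 if abs(w) <= threshold else w for w in row]
        for row in weights
    ]
    return pruned, threshold


if __name__ == "__main__":
    # Hypothetical 2x3 weight matrix used only for illustration.
    w = [[0.5, -0.1, 0.03],
         [-0.8, 0.2, -0.04]]
    pruned, threshold = magnitude_prune(w, fraction=0.5)
    print(threshold)  # 0.1
    print(pruned)     # [[0.5, 0.0, 0.0], [-0.8, 0.2, 0.0]]
```

Because pruning removes weights the network was trained with, it is usually followed by further training; the Findings above note that recovery takes multiple epochs.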

Code & Models

Repositories

harshildarji/thesis
PyTorch · Official

Datasets

harshildarji/Reber-Grammar
dataset · 10 downloads

Taxonomy

Methods: Sigmoid Activation · Tanh Activation · Pruning · Long Short-Term Memory · Gated Recurrent Unit