Structured Sparsification of Gated Recurrent Neural Networks
Ekaterina Lobacheva, Nadezhda Chirkova, Alexander Markovich, Dmitry Vetrov

TL;DR
This paper introduces a novel sparsification method for gated recurrent neural networks, including LSTMs, which simplifies their structure by sparsifying weights, neurons, and gate preactivations, leading to improved compression and task-specific gate structures.
Contribution
It extends existing sparsification techniques to gated RNNs by including gate preactivation sparsification, resulting in more efficient models with task-dependent structures.
Findings
Gate sparsity varies with the task.
The method improves neuron-wise compression in most of the tasks.
Making some gates constant simplifies the LSTM structure without significant performance loss.
Abstract
Recently, a lot of techniques were developed to sparsify the weights of neural networks and to remove networks' structure units, e.g. neurons. We adjust the existing sparsification approaches to the gated recurrent architectures. Specifically, in addition to the sparsification of weights and neurons, we propose sparsifying the preactivations of gates. This makes some gates constant and simplifies LSTM structure. We test our approach on the text classification and language modeling tasks. We observe that the resulting structure of gate sparsity depends on the task and connect the learned structure to the specifics of the particular tasks. Our method also improves neuron-wise compression of the model in most of the tasks.
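The claim that sparsifying gate preactivations "makes some gates constant" can be illustrated with a small sketch. This is not the authors' implementation: it assumes a hypothetical per-unit multiplier `mask` applied to the input-dependent part of a sigmoid gate's preactivation, with the bias kept. When a unit's multiplier is zero, its gate value reduces to `sigmoid(b[k])` regardless of the input and hidden state.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gate_values(W, U, b, mask, x, h):
    """Gate activation per hidden unit: sigmoid(mask[k] * (W x + U h)[k] + b[k]).

    `mask` is a hypothetical per-unit multiplier on the preactivation;
    a zero entry stands for a pruned (sparsified) preactivation.
    """
    out = []
    for k in range(len(b)):
        pre = sum(W[k][j] * x[j] for j in range(len(x)))
        pre += sum(U[k][j] * h[j] for j in range(len(h)))
        out.append(sigmoid(mask[k] * pre + b[k]))
    return out

rng = random.Random(0)
n_in, n_hid = 3, 4
W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hid)]
U = [[rng.uniform(-1, 1) for _ in range(n_hid)] for _ in range(n_hid)]
b = [rng.uniform(-1, 1) for _ in range(n_hid)]
mask = [1.0, 0.0, 1.0, 0.0]  # units 1 and 3 have pruned preactivations

def random_inputs():
    x = [rng.uniform(-1, 1) for _ in range(n_in)]
    h = [rng.uniform(-1, 1) for _ in range(n_hid)]
    return x, h

# Evaluate the same gate on two unrelated (input, hidden state) pairs.
g1 = gate_values(W, U, b, mask, *random_inputs())
g2 = gate_values(W, U, b, mask, *random_inputs())

# Pruned units give the same value both times; unpruned units generally differ.
constant_units = [k for k in range(n_hid) if g1[k] == g2[k]]
```

In this sketch a constant gate no longer needs its weight rows at inference time, since only the bias determines its value. That is one way to read the abstract's statement that gate sparsification simplifies the LSTM structure.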
Taxonomy
Methods
Test · Sigmoid Activation · Tanh Activation · Long Short-Term Memory
