Bayesian Sparsification of Gated Recurrent Neural Networks
Ekaterina Lobacheva, Nadezhda Chirkova, Dmitry Vetrov

TL;DR
This paper introduces a Bayesian sparsification method for gated recurrent neural networks, including LSTMs, which reduces complexity, speeds up computation, and enhances interpretability by sparsifying weights, neurons, and gate preactivations.
Contribution
It extends Bayesian sparsification to gate preactivations in LSTMs, leading to more efficient, interpretable, and task-dependent sparse recurrent architectures.
Findings
Sparsification speeds up forward passes.
Gate preactivation sparsification improves model compression.
The resulting sparsity structure is interpretable and task-specific.
Abstract
Bayesian methods have been successfully applied to sparsify weights of neural networks and to remove structural units from the networks, e.g. neurons. We apply and further develop this approach for gated recurrent architectures. Specifically, in addition to sparsification of individual weights and neurons, we propose to sparsify preactivations of gates and information flow in LSTM. This makes some gates and information flow components constant, speeds up the forward pass, and improves compression. Moreover, the resulting structure of gate sparsity is interpretable and depends on the task. Code is available on GitHub: https://github.com/tipt0p/SparseBayesianRNN
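To illustrate why sparsifying a gate preactivation makes that gate constant, here is a minimal NumPy sketch. It is not the authors' implementation: the per-unit multipliers `z`, the function `lstm_step`, and the parameter layout are assumptions made for illustration, and the multipliers are set by hand rather than learned with a Bayesian prior. When the multiplier on a gate's preactivation is zero, the gate reduces to the sigmoid of its bias and no longer depends on the input or the hidden state.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x, h, c, W, U, b, z):
    """One LSTM step with a hypothetical multiplier z[k] on each preactivation.

    W, U, b, z are dicts keyed by 'i', 'f', 'o' (input, forget, output gates)
    and 'g' (information flow). z[k] scales the input-dependent part only,
    so z[k] == 0 leaves just the bias b[k].
    """
    pre = {k: z[k] * (W[k] @ x + U[k] @ h) + b[k] for k in "ifog"}
    i, f, o = sigmoid(pre["i"]), sigmoid(pre["f"]), sigmoid(pre["o"])
    g = np.tanh(pre["g"])
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new, {"i": i, "f": f, "o": o, "g": g}

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = {k: rng.normal(size=(n_hid, n_in)) for k in "ifog"}
U = {k: rng.normal(size=(n_hid, n_hid)) for k in "ifog"}
b = {k: rng.normal(size=n_hid) for k in "ifog"}
z = {k: np.ones(n_hid) for k in "ifog"}

# Assumption: pretend sparsification has zeroed every forget-gate multiplier.
z["f"] = np.zeros(n_hid)

h, c = np.zeros(n_hid), np.zeros(n_hid)
forget_values, input_values = [], []
for _ in range(5):
    x = rng.normal(size=n_in)
    h, c, gates = lstm_step(x, h, c, W, U, b, z)
    forget_values.append(gates["f"])
    input_values.append(gates["i"])

# The forget gate equals sigmoid(bias) at every step; the input gate still varies.
forget_is_constant = all(np.allclose(f, sigmoid(b["f"])) for f in forget_values)
input_is_constant = all(np.allclose(i, input_values[0]) for i in input_values)
print(forget_is_constant, input_is_constant)
```

In this sketch a constant gate could be precomputed once, so the matrix products `W["f"] @ x` and `U["f"] @ h` would not need to be evaluated at all, which is one plausible source of the forward-pass speedup and compression the abstract describes.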
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
