Effective Quantization Approaches for Recurrent Neural Networks
Md Zahangir Alom, Adam T Moody, Naoya Maruyama, Brian C Van Essen, and Tarek M. Taha

TL;DR
This paper introduces effective quantization methods for RNNs, including LSTM, GRU, and ConvLSTM, to reduce computational costs while maintaining performance in tasks like sentiment analysis and video prediction.
Contribution
It proposes and evaluates binary, ternary, and quaternary quantization techniques for RNNs, demonstrating their effectiveness on multiple datasets.
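As a rough illustration of what binary and ternary weight quantization mean, the minimal sketch below maps full-precision weights onto {-1, +1} and onto {-1, 0, +1}. The function names, the deterministic sign rule, and the `threshold` value are assumptions chosen for illustration; this summary does not state the paper's exact quantization rule, and quaternary quantization is left out because its levels are not given here.

```python
import numpy as np

def binarize(w):
    """Deterministic binarization: weights >= 0 map to +1, the rest to -1."""
    return np.where(w >= 0, 1.0, -1.0)

def ternarize(w, threshold=0.5):
    """Map weights to -1, 0, or +1; |w| <= threshold becomes 0.
    The threshold is a hypothetical value, not taken from the paper."""
    return np.where(w > threshold, 1.0, np.where(w < -threshold, -1.0, 0.0))

# Example full-precision weights (made up for illustration).
w = np.array([-0.9, -0.2, 0.0, 0.3, 0.8])
wb = binarize(w)
wt = ternarize(w)
print(wb)
print(wt)
```

In a trained network, the quantized weights would stand in for the full-precision ones in the forward pass, so that multiplications reduce to sign flips or skipped terms; how the full-precision copy is kept and updated during training is a design choice not covered by this sketch.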
Findings
Quantization methods achieve accuracy comparable to full-precision models.
Binary, ternary, and quaternary quantizations reduce computational complexity.
Results are promising on sentiment analysis and video frame prediction tasks.
Abstract
Deep learning, and in particular Recurrent Neural Networks (RNNs), has shown superior accuracy in a large variety of tasks including machine translation, language understanding, and movie frame generation. However, these deep learning approaches are very expensive in terms of computation. In most cases, Graphics Processing Units (GPUs) are used for large-scale implementations. Meanwhile, energy-efficient RNN approaches have been proposed for deploying solutions on special-purpose hardware including Field Programmable Gate Arrays (FPGAs) and mobile platforms. In this paper, we propose an effective quantization approach for Recurrent Neural Network (RNN) techniques including Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), and Convolutional Long Short-Term Memory (ConvLSTM). We have implemented different quantization methods including Binary Connect {-1, 1}, Ternary Connect {-1, 0,…
Taxonomy
Methods: Convolution · ConvLSTM · Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Gated Recurrent Unit
