Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers
Junhao Xu, Xie Chen, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen, Meng

TL;DR
This paper introduces a novel ADMM-based method for low-bit quantization of RNN language models, significantly reducing model size and training time while maintaining performance.
Contribution
It formulates quantized RNNLM training as an optimization problem and employs ADMM to enable effective low-bit quantization from scratch.
Findings
Achieved up to 31x model size compression.
Faster convergence, 5 times quicker than baseline binarized RNNLMs.
Effective trade-off adjustment between compression and performance.
Abstract
The high memory consumption and computational costs of Recurrent neural network language models (RNNLMs) limit their wider application on resource constrained devices. In recent years, neural network quantization techniques that are capable of producing extremely low-bit compression, for example, binarized RNNLMs, are gaining increasing research interests. Directly training of quantized neural networks is difficult. By formulating quantized RNNLMs training as an optimization problem, this paper presents a novel method to train quantized RNNLMs from scratch using alternating direction methods of multipliers (ADMM). This method can also flexibly adjust the trade-off between the compression rate and model performance using tied low-bit quantization tables. Experiments on two tasks: Penn Treebank (PTB), and Switchboard (SWBD) suggest the proposed ADMM quantization achieved a model size…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNatural Language Processing Techniques · Speech Recognition and Synthesis · Topic Modeling
MethodsAlternating Direction Method of Multipliers
