Learning Sparse Low-Precision Neural Networks With Learnable Regularization

Yoojin Choi; Mostafa El-Khamy; Jungwon Lee

arXiv:1809.00095 · cs.CV · May 26, 2020

TL;DR

This paper introduces a learnable regularization approach to train low-precision neural networks, improving accuracy and compression ratios by aligning high-precision weights with their quantized counterparts.

Contribution

The paper proposes a mean squared quantization error (MSQE) regularizer with a learnable coefficient, and combines weight pruning, quantization, and entropy coding to compress low-precision DNNs.
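
To make the pruning, quantization, and entropy coding pipeline concrete, here is a minimal NumPy sketch that estimates a compression ratio from the empirical entropy of the quantized weights. It is an illustration under assumptions, not the paper's procedure: plain magnitude pruning stands in for the paper's regularization-based pruning, the uniform quantizer is a generic one, and all function names and parameters (`prune_by_magnitude`, `quantize_indices`, `entropy_bits`, `estimated_compression_ratio`, `ratio`, `delta`, `bits`) are hypothetical.

```python
import numpy as np

def prune_by_magnitude(w, ratio):
    """Zero out (at least) the `ratio` fraction of weights with smallest magnitude."""
    k = int(ratio * w.size)
    if k == 0:
        return w.copy()
    threshold = np.sort(np.abs(w), axis=None)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)

def quantize_indices(w, delta, bits):
    """Map weights to signed integer bin indices on a uniform grid with step `delta`."""
    qmax = 2 ** (bits - 1) - 1
    return np.clip(np.round(w / delta), -qmax - 1, qmax).astype(np.int64)

def entropy_bits(symbols):
    """Empirical entropy of the symbol distribution, in bits per symbol."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def estimated_compression_ratio(w, ratio, delta, bits, full_precision_bits=32):
    """Full-precision bits per weight divided by the entropy of the pruned, quantized weights."""
    q = quantize_indices(prune_by_magnitude(w, ratio), delta, bits)
    h = entropy_bits(q)
    return full_precision_bits / h if h > 0 else float("inf")
```

The entropy is only a lower bound on what an ideal entropy coder could reach; a real coder adds overhead, so the estimate is optimistic.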

Findings

1. Achieved state-of-the-art compression ratios of 7.13 and 6.79 on ImageNet with MobileNet and ShuffleNet.

2. Produced 8-bit low-precision models for super-resolution with negligible performance loss.

3. Enhanced training convergence and accuracy of low-precision neural networks.

Abstract

We consider learning deep neural networks (DNNs) that consist of low-precision weights and activations for efficient inference of fixed-point operations. In training low-precision networks, gradient descent in the backward pass is performed with high-precision weights while quantized low-precision weights and activations are used in the forward pass to calculate the loss function for training. Thus, the gradient descent becomes suboptimal, and accuracy loss follows. In order to reduce the mismatch in the forward and backward passes, we utilize mean squared quantization error (MSQE) regularization. In particular, we propose using a learnable regularization coefficient with the MSQE regularizer to reinforce the convergence of high-precision weights to their quantized values. We also investigate how partial L2 regularization can be employed for weight pruning in a similar manner. Finally,…
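
The MSQE regularizer with a learnable coefficient can be sketched in a few lines of NumPy. This is a hedged illustration, not the authors' implementation: the symmetric uniform quantizer, the parameterization of the coefficient as `alpha = exp(log_alpha)`, and the `-lam * log(alpha)` term that keeps the coefficient from collapsing to zero are assumptions made here, and every name (`quantize_uniform`, `msqe`, `regularized_loss`, `grad_log_alpha`, `delta`, `bits`, `lam`) is hypothetical.

```python
import numpy as np

def quantize_uniform(w, delta, bits):
    """Round weights to a symmetric uniform grid with step `delta` and `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    return delta * np.clip(np.round(w / delta), -qmax - 1, qmax)

def msqe(w, delta, bits):
    """Mean squared error between high-precision weights and their quantized values."""
    return float(np.mean((w - quantize_uniform(w, delta, bits)) ** 2))

def regularized_loss(task_loss, w, log_alpha, delta=0.05, bits=4, lam=1.0):
    """task_loss + alpha * MSQE - lam * log(alpha), with alpha = exp(log_alpha).

    The -lam * log(alpha) term is an assumption of this sketch; without some such
    penalty, minimizing over alpha would simply drive the coefficient to zero.
    """
    alpha = np.exp(log_alpha)
    return task_loss + alpha * msqe(w, delta, bits) - lam * log_alpha

def grad_log_alpha(w, log_alpha, delta=0.05, bits=4, lam=1.0):
    """Derivative of regularized_loss with respect to log_alpha."""
    return np.exp(log_alpha) * msqe(w, delta, bits) - lam
```

Under these assumptions, gradient descent on `log_alpha` settles where `alpha * MSQE = lam`, so the coefficient grows as the weights approach their quantized values, which tightens the pull toward the quantization grid over the course of training.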

Taxonomy

Methods: Pruning · 1x1 Convolution · Convolution · Local Response Normalization · Grouped Convolution · Dropout · Dense Connections · Max Pooling · Softmax