SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks

Julian Faraone; Nicholas Fraser; Michaela Blott; Philip H.W. Leong

arXiv:1807.00301 · cs.CV · July 3, 2018

TL;DR

This paper presents SYQ, a quantization method that learns a symmetric codebook for each subgroup of weights in a neural network. It reduces the accuracy loss of very low-precision (binary and ternary) networks while preserving the hardware simplicity of those representations.

Contribution

Introduces a symmetric quantization approach that learns codebooks for weight subgroups, reducing accuracy loss in low-precision neural network quantization.

Findings

1. Symmetric quantization improves accuracy for binary and ternary networks.

2. The method maintains hardware simplicity for low-precision representations.

3. Empirical results show significant accuracy gains with minimal hardware impact.
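
The hardware-simplicity point can be illustrated with a small sketch. It assumes a ternary subgroup whose weights are all drawn from a symmetric codebook {-alpha, 0, +alpha}; the function name `ternary_dot_with_scale` and the choice of a 16-element subgroup are hypothetical and not taken from the paper or its repository. Because the positive and negative magnitudes are equal, the scale factors out of the dot product, leaving additions and subtractions plus a single multiplication per subgroup.

```python
import numpy as np

def ternary_dot_with_scale(codes, alpha, x):
    """Dot product of a symmetric ternary weight subgroup with activations x.

    codes: integer array with entries in {-1, 0, +1}
    alpha: one positive scale shared by the whole subgroup (illustrative assumption)
    """
    # Add activations where the code is +1, subtract them where it is -1.
    acc = x[codes == 1].sum() - x[codes == -1].sum()
    # A single multiplication per subgroup, rather than one per weight.
    return alpha * acc

rng = np.random.default_rng(0)
codes = rng.integers(-1, 2, size=16)   # entries drawn from {-1, 0, +1}
alpha = 0.37                           # arbitrary illustrative scale
x = rng.standard_normal(16)

reference = np.dot(alpha * codes, x)   # ordinary floating-point dot product
fast = ternary_dot_with_scale(codes, alpha, x)
```

With a codebook whose positive and negative magnitudes differ, the two partial sums would each need their own scale, which is the extra cost a symmetric codebook avoids.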

Abstract

Inference for state-of-the-art deep neural networks is computationally expensive, making them difficult to deploy on constrained hardware environments. An efficient way to reduce this complexity is to quantize the weight parameters and/or activations during training by approximating their distributions with a limited-entry codebook. For very low precisions, such as binary or ternary networks with 1-8-bit activations, the information loss from quantization leads to significant accuracy degradation due to large gradient mismatches between the forward and backward functions. In this paper, we introduce a quantization method to reduce this loss by learning a symmetric codebook for particular weight subgroups. These subgroups are determined based on their locality in the weight matrix, such that the hardware simplicity of the low-precision representations is preserved. Empirically, we show…
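
The idea of a symmetric codebook per weight subgroup can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: subgroups are taken to be the rows of the weight matrix, the ternary threshold is set to a fixed fraction of the largest absolute weight, and each row's scale is set to its mean absolute weight. In the method described above the scales are learned during training; that training loop is not shown here. The names `ternarize_rows`, `dequantize`, and `threshold_ratio` are hypothetical.

```python
import numpy as np

def ternarize_rows(W, threshold_ratio=0.05):
    """Map W to ternary codes plus one symmetric scale per row.

    threshold_ratio is an illustrative assumption, not a value from the abstract.
    """
    # Weights with magnitude at or below eta are mapped to 0.
    eta = threshold_ratio * np.abs(W).max()
    codes = np.where(W > eta, 1, np.where(W < -eta, -1, 0)).astype(np.int8)
    # One scale per row (the subgroup). Here it is only an initial value;
    # in training it would be a learned parameter.
    alpha = np.abs(W).mean(axis=1, keepdims=True)
    return codes, alpha

def dequantize(codes, alpha):
    # Each row takes values only from its own codebook {-alpha_i, 0, +alpha_i}.
    return alpha * codes

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))
codes, alpha = ternarize_rows(W)
W_q = dequantize(codes, alpha)
```

Grouping by row is one way to base subgroups on locality in the weight matrix; finer or coarser groupings would change only the axis over which `alpha` is computed.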

Code & Models

Repositories

julianfaraone/SYQ (TensorFlow, official)