Towards Efficient Training for Neural Network Quantization

Qing Jin; Linjie Yang; Zhenyu Liao

arXiv:1912.10207 · cs.CV · December 24, 2019 · 33 citations

TL;DR

This paper investigates the causes of accuracy loss in neural network quantization, identifies critical training rules, and introduces scale-adjusted training (SAT) to improve quantization performance, achieving state-of-the-art results.

Contribution

It proposes the SAT technique based on new insights into gradient propagation, enhancing training efficiency and accuracy in neural network quantization.

Findings

01

SAT improves quantization accuracy across models

02

Quantized models outperform full-precision counterparts in experiments

03

Gradient-calibrated PACT addresses the quantization error introduced when calculating the PACT gradient

Abstract

Quantization reduces the computation costs of neural networks but suffers from performance degeneration. Is this accuracy drop due to the reduced capacity, or to inefficient training during the quantization procedure? After looking into the gradient propagation process of neural networks by viewing the weights and intermediate activations as random variables, we discover two critical rules for efficient training. Recent quantization approaches violate the two rules, which results in degenerated convergence. To deal with this problem, we propose a simple yet effective technique, named scale-adjusted training (SAT), that complies with the discovered rules and facilitates efficient training. We also analyze the quantization error introduced in calculating the gradient in the popular parameterized clipping activation (PACT) technique. Through SAT together with gradient-calibrated PACT, quantized models…
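For readers unfamiliar with PACT, the following is a minimal sketch of the baseline technique the abstract refers to, assuming the commonly described formulation: activations are clipped to [0, alpha] with a learnable threshold alpha, then rounded to uniform levels. The function names `pact_quantize` and `pact_alpha_grad` are hypothetical and chosen for illustration. The gradient shown is the plain straight-through estimate for alpha, not the gradient-calibrated variant proposed in this paper, and SAT itself is not shown.

```python
import math


def pact_quantize(x: float, alpha: float, bits: int) -> float:
    """Clip x to [0, alpha], then round it to one of 2**bits uniform levels.

    Illustrative sketch of baseline PACT activation quantization
    (an assumption about the standard formulation, not this paper's code).
    """
    if alpha <= 0 or bits < 1:
        raise ValueError("alpha must be positive and bits must be >= 1")
    levels = 2 ** bits - 1                   # number of quantization steps
    clipped = min(max(x, 0.0), alpha)        # parameterized clipping
    # Round half up; avoids Python's round-half-to-even behaviour on ties.
    index = math.floor(clipped * levels / alpha + 0.5)
    return index * alpha / levels


def pact_alpha_grad(x: float, alpha: float) -> float:
    """Straight-through estimate of d(output)/d(alpha) in baseline PACT.

    Only inputs at or above the clipping threshold contribute; rounding is
    treated as the identity, which is the approximation whose error the
    abstract says the paper analyzes.
    """
    return 1.0 if x >= alpha else 0.0


if __name__ == "__main__":
    # Quantize a few activations to 2 bits with a clipping threshold of 1.0.
    for value in (-0.4, 0.3, 0.8, 2.0):
        print(value, pact_quantize(value, alpha=1.0, bits=2),
              pact_alpha_grad(value, alpha=1.0))
```

With 2 bits and alpha = 1.0, every output falls on one of four values (0, 1/3, 2/3, 1). Inputs below the threshold receive a zero gradient with respect to alpha under this estimate.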

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques