Toward INT4 Fixed-Point Training via Exploring Quantization Error for Gradients

Dohyung Kim; Junghyup Lee; Jeimin Jeon; Jaehyeon Moon; Bumsub Ham

arXiv:2407.12637·cs.CV·July 18, 2024

TL;DR

This paper proposes a low-bit fixed-point gradient quantization method for neural network training. It sets the quantization interval to minimize the quantization error of large-magnitude gradients, which improves accuracy while retaining the efficiency of low-bit training.

Contribution

The paper analyzes the quantization error of gradients and introduces an adaptive interval update algorithm that adjusts the quantization interval during training to keep the error for large gradients small.

Findings

01. Significant performance improvement across multiple tasks.
02. Effective quantization for various network architectures.
03. Enhanced training stability with low-bit gradients.

Abstract

Network quantization generally converts full-precision weights and/or activations into low-bit fixed-point values in order to accelerate inference. Recent approaches to network quantization further discretize the gradients into low-bit fixed-point values, enabling efficient training. They typically set the quantization interval using the min-max range of the gradients, or adjust the interval such that the quantization error over all gradients is minimized. In this paper, we analyze the quantization error of gradients for low-bit fixed-point training, and show that lowering the error for large-magnitude gradients boosts the quantization performance significantly. Based on this, we derive an upper bound of the quantization error for the large gradients in terms of the quantization interval, and obtain an optimal condition for the interval minimizing the quantization error for…
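To make the role of the quantization interval concrete, here is a minimal sketch in plain Python. It is not the paper's algorithm: the symmetric 4-bit quantizer, the helper names `quantize` and `mean_abs_error`, the synthetic gradient distribution, and the "50 largest gradients" threshold are all assumptions chosen for illustration. It only shows how one might measure the error over all gradients and over large-magnitude gradients separately for two candidate intervals, the min-max range and a narrower one.

```python
import random


def quantize(values, interval, bits=4):
    """Symmetric uniform quantizer (illustrative assumption, not the paper's).

    Clips each value to [-interval, interval], then rounds it to the nearest
    of 2 * (2**(bits-1) - 1) + 1 evenly spaced levels (15 levels for 4 bits).
    """
    levels = 2 ** (bits - 1) - 1          # 7 positive levels for INT4
    step = interval / levels              # spacing between adjacent levels
    out = []
    for v in values:
        clipped = max(-interval, min(interval, v))
        out.append(round(clipped / step) * step)  # Python rounds ties to even
    return out


def mean_abs_error(values, quantized, threshold=0.0):
    """Mean |v - q| over entries with |v| >= threshold (0.0 selects all)."""
    errs = [abs(v - q) for v, q in zip(values, quantized) if abs(v) >= threshold]
    return sum(errs) / len(errs) if errs else 0.0


# Synthetic, heavy-tailed "gradients": many small values plus a few large ones.
random.seed(0)
grads = [random.gauss(0.0, 1e-3) for _ in range(1000)]
grads += [random.gauss(0.0, 1e-1) for _ in range(20)]

minmax = max(abs(g) for g in grads)               # min-max interval
large = sorted(abs(g) for g in grads)[-50]        # hypothetical "large" cutoff

for name, interval in [("min-max", minmax), ("half of min-max", 0.5 * minmax)]:
    q = quantize(grads, interval)
    print(name,
          "error over all:", mean_abs_error(grads, q),
          "error over large:", mean_abs_error(grads, q, large))
```

Running the loop prints both error measures for each interval, which makes it easy to see that the two measures need not favor the same interval. How the paper actually selects and updates the interval is described in the full text, not in this sketch.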

Taxonomy

Topics: Neural Networks and Applications

Methods: Sparse Evolutionary Training