Accurate Neural Training with 4-bit Matrix Multiplications at Standard Formats

Brian Chmiel; Ron Banner; Elad Hoffer; Hilla Ben Yaacov; Daniel Soudry

arXiv:2112.10769·cs.LG·June 11, 2024·6 cites

TL;DR

This paper introduces a novel 4-bit quantization method for neural network training that combines unbiased and logarithmic quantization, enabling efficient training with minimal accuracy loss.

Contribution

The authors propose logarithmic unbiased quantization (LUQ), a method that quantizes both the forward and backward passes to 4 bits while keeping the accuracy loss small.
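To make "both forward and backward passes" concrete, here is a minimal sketch of the three matrix multiplications involved in one training step of a single layer. It assumes a bias-free fully connected layer y = xW, and the helper names (`matmul`, `transpose`, `linear_training_step`) are hypothetical and not taken from the paper. Only the first product belongs to the forward pass; the other two consume the neural gradient `grad_y`.

```python
def matmul(a, b):
    """Plain nested-list matrix product (a is n x k, b is k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Transpose a nested-list matrix."""
    return [list(r) for r in zip(*m)]


def linear_training_step(x, w, grad_y):
    """Return the three matmul results for an assumed bias-free layer y = x @ w."""
    y = matmul(x, w)                       # forward pass
    grad_x = matmul(grad_y, transpose(w))  # backward: gradient passed to the previous layer
    grad_w = matmul(transpose(x), grad_y)  # backward: gradient used to update the weights
    return y, grad_x, grad_w


x = [[1.0, 2.0]]
w = [[3.0, 4.0], [5.0, 6.0]]
grad_y = [[1.0, 1.0]]
y, grad_x, grad_w = linear_training_step(x, w, grad_y)
```

Quantizing only `x` and `w` speeds up the first product alone; the two backward products stay in higher precision unless `grad_y` is quantized as well.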

Findings

01. Achieved 1.1% degradation on ResNet50 with 4-bit training.

02. Reduced degradation to 0.32% after fine-tuning.

03. State-of-the-art results in 4-bit neural network training.

Abstract

Quantization of the weights and activations is one of the main methods to reduce the computational footprint of Deep Neural Network (DNN) training. Current methods enable 4-bit quantization of the forward phase. However, this constitutes only a third of the training process. Reducing the computational footprint of the entire training process requires the quantization of the neural gradients, i.e., the loss gradients with respect to the outputs of intermediate neural layers. Previous works separately showed that accurate 4-bit quantization of the neural gradients needs to (1) be unbiased and (2) have a log scale. However, no previous work aimed to combine both ideas, as we do in this work. Specifically, we examine the importance of having unbiased quantization in quantized neural network training, where to maintain it, and how to combine it with logarithmic quantization. Based on…
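The two requirements named in the abstract, unbiasedness and a log scale, can be illustrated together with a small sketch. This is not the authors' LUQ implementation: it only shows stochastic rounding onto a power-of-two grid, with the function name `log_stochastic_round` chosen here for illustration, and it omits the clipping to a limited exponent range that an actual 4-bit format would require.

```python
import math
import random


def log_stochastic_round(x, rng=random):
    """Round x to a neighbouring power of two so that the expected result equals x."""
    if x == 0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    # frexp gives mag = m * 2**e with 0.5 <= m < 1, so 2**(e-1) <= mag < 2**e.
    _, e = math.frexp(mag)
    low = math.ldexp(1.0, e - 1)
    high = 2.0 * low
    # Choose P(round up) so that low * (1 - p_up) + high * p_up == mag.
    p_up = (mag - low) / (high - low)
    return sign * (high if rng.random() < p_up else low)


rng = random.Random(0)
samples = [log_stochastic_round(3.0, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)  # close to 3.0, although each sample is 2.0 or 4.0
```

Each individual output lies on the logarithmic grid, while the average over many draws approaches the original value. Deterministic rounding on the same grid would always return the same grid point for a given input and so leave a systematic error.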

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning