Deep Neural Network Training without Multiplications

Tsuguo Mogami

arXiv:2012.03458·cs.LG·December 8, 2020

TL;DR

This paper proposes training deep neural networks without floating-point multiplications: each multiplication is replaced by an integer addition of the two operands' IEEE754 bit patterns. ResNet trained this way reaches competitive classification accuracy without the stabilization methods that low-precision training usually needs.

Contribution

It replaces the floating-point multiply instruction with an integer-add instruction in both training and inference. Unlike common low-precision training schemes, the proposal required no additional methods to counter instability or accuracy loss.

Findings

01 ResNet trained with the integer-add operation achieves competitive classification accuracy.

02 The proposal required no methods to address the instability and accuracy loss common in low-precision training.

03 In some settings, accuracy equals the baseline FP32 result.

Abstract

Is multiplication really necessary for deep neural networks? Here we propose just adding two IEEE754 floating-point numbers with an integer-add instruction in place of a floating-point multiplication instruction. We show that ResNet can be trained using this operation with competitive classification accuracy. Our proposal did not require any methods to solve instability and decrease in accuracy, which is common in low-precision training. In some settings, we may obtain equal accuracy to the baseline FP32 result. This method will enable eliminating the multiplications in deep neural-network training and inference.
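The core operation in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names, the separate sign handling, the clamping, and the subtraction of the bit pattern of 1.0 (to cancel the doubled exponent bias) are assumptions made here for illustration. The idea is that adding the raw bit patterns of two floats adds their exponents and, approximately, their mantissas, which behaves like a multiplication.

```python
import struct

ONE_BITS = 0x3F800000   # bit pattern of 1.0 in IEEE754 binary32 (assumed bias correction)
SIGN_MASK = 0x80000000
MAG_MASK = 0x7FFFFFFF
MAX_FINITE = 0x7F7FFFFF  # largest finite binary32 magnitude


def float_to_bits(x):
    """Return the 32-bit pattern of x rounded to IEEE754 binary32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(b):
    """Interpret a 32-bit pattern as an IEEE754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", b & 0xFFFFFFFF))[0]


def approx_mul(a, b):
    """Approximate a * b with one integer addition of the bit patterns.

    Illustrative only: the sign is combined with XOR and the result is
    clamped to the finite range. Zero and denormal inputs are not treated
    specially, so 0 * x yields a very small nonzero value here.
    """
    ia, ib = float_to_bits(a), float_to_bits(b)
    sign = (ia ^ ib) & SIGN_MASK
    # Integer add of magnitudes; subtract one exponent bias so it is not counted twice.
    mag = (ia & MAG_MASK) + (ib & MAG_MASK) - ONE_BITS
    mag = max(0, min(mag, MAX_FINITE))
    return bits_to_float(sign | mag)


if __name__ == "__main__":
    for a, b in [(2.0, 4.0), (3.0, 1.0), (1.5, 1.5), (-2.0, 4.0)]:
        print(a, b, approx_mul(a, b), a * b)
```

In this sketch the result is exact whenever one operand is a power of two (for example 2.0 and 4.0 give 8.0), while 1.5 times 1.5 gives 2.0 instead of 2.25, an error of about 11%. The abstract's claim is that ResNet training tolerates this kind of approximation.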

Code & Models

Repositories

epfml/piecewise-affine-multiplication
pytorch

Taxonomy

Topics: Numerical Methods and Algorithms · Model Reduction and Neural Networks · Neural Networks and Applications

Methods: 1x1 Convolution · Convolution · Max Pooling · Kaiming Initialization · Residual Connection · Bottleneck Residual Block · Average Pooling · Batch Normalization · Global Average Pooling