Training deep neural networks with low precision multiplications
Matthieu Courbariaux, Yoshua Bengio, Jean-Pierre David

TL;DR
This paper demonstrates that deep neural networks can be effectively trained using very low precision multiplications, such as 10-bit, reducing hardware complexity without significantly affecting accuracy.
Contribution
It provides empirical evidence that low-precision multiplications are sufficient for training state-of-the-art Maxout networks on three benchmark datasets: MNIST, CIFAR-10 and SVHN.
Findings
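Maxout networks can be trained with multiplications as narrow as 10 bits.
Very low precision is sufficient for training networks, not just for running trained ones.
Three number formats are compared (floating point, fixed point and dynamic fixed point), and the effect of multiplication precision on the final error is assessed for each.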
Abstract
Multipliers are the most space- and power-hungry arithmetic operators in digital implementations of deep neural networks. We train a set of state-of-the-art neural networks (Maxout networks) on three benchmark datasets: MNIST, CIFAR-10 and SVHN. They are trained with three distinct formats: floating point, fixed point and dynamic fixed point. For each of those datasets and for each of those formats, we assess the impact of the precision of the multiplications on the final error after training. We find that very low precision is sufficient not just for running trained networks but also for training them. For example, it is possible to train Maxout networks with 10-bit multiplications.
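To make the idea of a low-precision multiplication concrete, here is a minimal Python sketch of simulated fixed-point arithmetic. It is not the authors' code. The function names, the 10-bit width with 5 fractional bits, the saturation on overflow, and the choice to accumulate products at full precision are all assumptions made for illustration.

```python
def quantize_fixed(x, total_bits=10, frac_bits=5):
    """Snap x to a signed fixed-point grid (illustrative helper, not from the paper).

    total_bits: width of the stored integer, including the sign bit (assumed).
    frac_bits:  number of bits after the binary point (assumed).
    Values outside the representable range saturate to the nearest limit.
    """
    scale = 2 ** frac_bits
    q_min = -(2 ** (total_bits - 1))
    q_max = 2 ** (total_bits - 1) - 1
    # Python's round() rounds exact halves to the nearest even integer.
    q = round(x * scale)
    q = max(q_min, min(q_max, q))
    return q / scale


def low_precision_dot(weights, inputs, total_bits=10, frac_bits=5):
    """Dot product whose multiplication operands are quantized first.

    The products are summed in ordinary Python floats, which is an
    assumption of this sketch rather than a statement about the paper.
    """
    return sum(
        quantize_fixed(w, total_bits, frac_bits)
        * quantize_fixed(a, total_bits, frac_bits)
        for w, a in zip(weights, inputs)
    )


if __name__ == "__main__":
    print(quantize_fixed(0.3))                            # 0.3125
    print(quantize_fixed(100.0))                          # 15.96875 (saturated)
    print(low_precision_dot([0.3, -0.7], [1.0, 0.5]))     # -0.03125
```

With 5 fractional bits the grid spacing is 1/32, so 0.3 becomes 0.3125 and anything above 15.96875 saturates. A dynamic fixed-point variant would let `frac_bits` change over time for a group of values instead of fixing it up front.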
Taxonomy
Topics: Numerical Methods and Algorithms · Neural Networks and Applications · Advanced Neural Network Applications
Methods: Maxout
