Binarized Neural Networks
Itay Hubara, Daniel Soudry, Ran El Yaniv

TL;DR
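This paper presents a method for training Binarized Neural Networks (BNNs), whose weights and activations are binary at run-time. Binarization sharply reduces memory size and accesses and replaces most arithmetic with bit-wise operations, and a custom binary matrix multiplication GPU kernel runs the MNIST BNN 7 times faster than an unoptimized GPU kernel with no loss in classification accuracy.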
Contribution
The paper introduces a method for training networks with binary weights and activations, and a binary matrix multiplication GPU kernel that speeds up the trained network without changing its classification accuracy.
Findings
BNNs achieve nearly state-of-the-art accuracy on MNIST, CIFAR-10, and SVHN.
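The binary matrix multiplication GPU kernel runs the MNIST BNN 7 times faster than an unoptimized GPU kernel, with no loss in classification accuracy.
During the forward pass, BNNs drastically reduce memory size and accesses and replace most arithmetic operations with bit-wise operations, which might greatly increase power efficiency.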
Abstract
We introduce a method to train Binarized Neural Networks (BNNs) - neural networks with binary weights and activations at run-time and when computing the parameters' gradient at train-time. We conduct two sets of experiments, each based on a different framework, namely Torch7 and Theano, where we train BNNs on MNIST, CIFAR-10 and SVHN, and achieve nearly state-of-the-art results. During the forward pass, BNNs drastically reduce memory size and accesses, and replace most arithmetic operations with bit-wise operations, which might lead to a great increase in power-efficiency. Last but not least, we wrote a binary matrix multiplication GPU kernel with which it is possible to run our MNIST BNN 7 times faster than with an unoptimized GPU kernel, without suffering any loss in classification accuracy. The code for training and running our BNNs is available.
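To make the "bit-wise operations" claim concrete, here is a minimal Python sketch of how a dot product between two +1/-1 vectors can be computed with XNOR and a population count instead of multiplications. It is an illustration under stated assumptions, not the authors' GPU kernel: the function names `binarize`, `pack_bits`, and `xnor_popcount_dot` are hypothetical, and the bit encoding (+1 as bit 1, -1 as bit 0) is one common convention chosen here for clarity.

```python
import random

def binarize(values):
    """Sign binarization (assumed convention): x >= 0 maps to +1, x < 0 maps to -1."""
    return [1 if x >= 0 else -1 for x in values]

def pack_bits(signs):
    """Pack a +1/-1 vector into one integer: +1 sets bit i, -1 leaves it clear."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word

def xnor_popcount_dot(a_word, b_word, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks the positions where the signs agree. Each agreement
    contributes +1 and each disagreement -1, so the dot product is
    matches - (n - matches) = 2 * matches - n.
    """
    mask = (1 << n) - 1                  # keep only the n valid bits
    agree = ~(a_word ^ b_word) & mask    # XNOR restricted to n bits
    matches = bin(agree).count("1")      # population count
    return 2 * matches - n

# Check against the ordinary multiply-accumulate dot product.
random.seed(0)
n = 64
a = binarize([random.gauss(0, 1) for _ in range(n)])
b = binarize([random.gauss(0, 1) for _ in range(n)])
reference = sum(x * y for x, y in zip(a, b))
assert xnor_popcount_dot(pack_bits(a), pack_bits(b), n) == reference
```

In this encoding 64 signs fit in a single 64-bit word, so one XNOR plus one popcount stands in for 64 multiply-accumulate steps, which is the kind of saving the abstract points to when it mentions reduced memory size and bit-wise arithmetic.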
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and ELM · Neural Networks and Applications
