Neural Networks with Few Multiplications

Zhouhan Lin; Matthieu Courbariaux; Roland Memisevic; Yoshua Bengio

arXiv:1510.03009 · cs.LG · February 29, 2016 · ICLR · 155 cites

TL;DR

This paper introduces a training method that removes most floating-point multiplications by stochastically binarizing weights and quantizing layer representations during backpropagation, with no reported loss in classification accuracy on MNIST, CIFAR10, and SVHN.

Contribution

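The paper combines stochastic binarization of weights, which turns the multiplications used to compute hidden states into sign changes, with quantization of each layer's representations during backpropagation, which turns the remaining multiplications into binary shifts.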
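The stochastic binarization step can be illustrated with a minimal sketch. The summary does not state the sampling rule, so the probability used here, P(+1) = (clip(w, -1, 1) + 1) / 2, is an assumption, and the names `stochastic_binarize` and `dot_with_binary_weights` are hypothetical. The point is only that once weights are -1 or +1, a dot product needs additions and subtractions rather than multiplications.

```python
import random

def stochastic_binarize(w, rng):
    """Sample -1.0 or +1.0 for a real-valued weight w.

    Assumed rule (not given in the summary): P(+1) = (clip(w, -1, 1) + 1) / 2,
    so the sampled weight equals clip(w, -1, 1) in expectation.
    """
    p_plus = (max(-1.0, min(1.0, w)) + 1.0) / 2.0
    return 1.0 if rng.random() < p_plus else -1.0

def dot_with_binary_weights(x, w_bin):
    """Dot product with +/-1 weights using only additions and subtractions."""
    total = 0.0
    for xi, wi in zip(x, w_bin):
        # A +1 weight adds the input; a -1 weight subtracts it (a sign change).
        total = total + xi if wi > 0 else total - xi
    return total

rng = random.Random(0)
weights = [0.9, -0.2, 0.4]          # real-valued weights kept by the optimizer
w_bin = [stochastic_binarize(w, rng) for w in weights]
x = [0.5, 1.5, -2.0]                # example inputs to one hidden unit
y = dot_with_binary_weights(x, w_bin)
```

In this sketch the real-valued weights are kept and only a sampled binary copy enters the dot product, which is what lets the expectation of the binary weight track the underlying value.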

Findings

01

Achieves accuracy comparable to or better than standard stochastic gradient descent training on MNIST, CIFAR10, and SVHN.

02

Replaces most floating-point multiplications in training with sign changes and binary shifts, a step toward faster, hardware-friendly training.

03

The result holds on all three datasets tested, with no reported loss in classification performance.

Abstract

For most deep learning algorithms, training is notoriously time consuming. Since most of the computation in training neural networks is typically spent on floating point multiplications, we investigate an approach to training that eliminates the need for most of these. Our method consists of two parts: First, we stochastically binarize weights to convert multiplications involved in computing hidden states to sign changes. Second, while back-propagating error derivatives, in addition to binarizing the weights, we quantize the representations at each layer to convert the remaining multiplications into binary shifts. Experimental results across 3 popular datasets (MNIST, CIFAR10, SVHN) show that this approach not only does not hurt classification performance but can result in even better performance than standard stochastic gradient descent training, paving the way to fast, hardware-friendly…
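To make the second part concrete, here is a minimal sketch of how quantizing a representation to a power of two lets a multiplication become a shift. The abstract does not give the quantization rule, so rounding the exponent to the nearest integer in the log domain is an assumption, and `quantize_pow2` and `shift_multiply` are hypothetical names. `math.ldexp` stands in for a hardware binary shift by adjusting the floating-point exponent.

```python
import math

def quantize_pow2(x):
    """Quantize x to sign * 2**exponent and return (sign, exponent).

    Assumed rule: the exponent is log2(|x|) rounded to the nearest integer.
    Zero is returned as (0, 0) so callers can short-circuit it.
    """
    if x == 0.0:
        return 0, 0
    sign = 1 if x > 0 else -1
    exponent = round(math.log2(abs(x)))
    return sign, exponent

def shift_multiply(value, sign, exponent):
    """Compute value * sign * 2**exponent with an exponent shift, not a multiply."""
    if sign == 0:
        return 0.0
    shifted = math.ldexp(value, exponent)   # value scaled by 2**exponent
    return shifted if sign > 0 else -shifted

# One term of a weight update: error derivative times a quantized representation.
delta = 0.75                                # example back-propagated error derivative
sign, exponent = quantize_pow2(0.3)         # 0.3 is quantized to 2**-2 = 0.25
update_term = shift_multiply(delta, sign, exponent)
```

Here the exact product 0.75 * 0.3 = 0.225 is approximated by 0.75 * 0.25 = 0.1875, which trades some precision for an operation that needs no multiplier.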

Taxonomy

Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Neural Networks and Applications