Training wide residual networks for deployment using a single bit for each weight

Mark D. McDonnell

arXiv:1802.08530 · cs.LG · February 27, 2018 · 42 cites

TL;DR

This paper shows that deep convolutional neural networks can be trained for deployment with only one bit per weight, reaching near full-precision accuracy on multiple datasets with a simpler training method than earlier binarization approaches.

Contribution

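The author introduces a simplified binarization approach for wide residual networks: weights are binarized with the sign function and scaled by constant, unlearned per-layer factors. The approach achieves state-of-the-art accuracy with 1-bit weights while reducing training complexity and parameter memory.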

Findings

01. Achieved 3.9% error on CIFAR-10 with 1-bit weights

02. Error rates halved compared to previous binarization methods on CIFAR

03. Training speed comparable to full-precision networks with improved accuracy

Abstract

For fast and energy-efficient deployment of trained deep neural networks on resource-constrained embedded hardware, each learned weight parameter should ideally be represented and stored using a single bit. Error-rates usually increase when this requirement is imposed. Here, we report large improvements in error rates on multiple datasets, for deep convolutional neural networks deployed with 1-bit-per-weight. Using wide residual networks as our main baseline, our approach simplifies existing methods that binarize weights by applying the sign function in training; we apply scaling factors for each layer with constant unlearned values equal to the layer-specific standard deviations used for initialization. For CIFAR-10, CIFAR-100 and ImageNet, and models with 1-bit-per-weight requiring less than 10 MB of parameter memory, we achieve error rates of 3.9%, 18.5% and 26.0% / 8.5% (Top-1 /…
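The binarization step described in the abstract (the sign function plus a constant, unlearned per-layer scale equal to the initialization standard deviation) can be sketched as follows. This is a minimal illustration, not the author's implementation: the He-style standard deviation `sqrt(2 / fan_in)`, the mapping of zero to +1, and all function names (`he_std`, `binarize_layer`, `deployed_weights`, `parameter_megabytes`) are assumptions made for the example. Gradient handling during training is omitted.

```python
import math
import random


def he_std(fan_in):
    """Initialization standard deviation, assumed here to be He-style sqrt(2 / fan_in)."""
    return math.sqrt(2.0 / fan_in)


def binarize_layer(weights, fan_in):
    """Return (bits, scale): one boolean per weight and one constant scale per layer.

    The scale is never learned; it is fixed to the layer's initialization
    standard deviation. Zero is mapped to +1 (an assumption) so that every
    weight fits in exactly one bit.
    """
    scale = he_std(fan_in)
    bits = [w >= 0 for w in weights]
    return bits, scale


def deployed_weights(bits, scale):
    """Reconstruct the effective weights used at inference: +scale or -scale."""
    return [scale if b else -scale for b in bits]


def parameter_megabytes(num_weights, bits_per_weight=1):
    """Parameter memory in MB (10**6 bytes) for a given bit width per weight."""
    return num_weights * bits_per_weight / 8 / 1e6


# Hypothetical layer: 3x3 kernels over 64 input channels, 8 output channels.
random.seed(0)
fan_in = 3 * 3 * 64
shadow = [random.gauss(0.0, he_std(fan_in)) for _ in range(fan_in * 8)]

bits, scale = binarize_layer(shadow, fan_in)
effective = deployed_weights(bits, scale)
```

Only `bits` and the single `scale` value need to be stored per layer. Under this accounting, `parameter_megabytes(80_000_000)` gives 10 MB for 80 million 1-bit weights, compared with 320 MB for the same count at 32 bits; these figures are plain arithmetic and are not taken from the paper.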

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification