Training wide residual networks for deployment using a single bit for each weight
Mark D. McDonnell

TL;DR
This paper shows that deep convolutional neural networks can be trained for deployment with only one bit per weight while staying close to full-precision accuracy. A simplified training method yields large error-rate improvements over earlier binarization approaches on multiple datasets.
Contribution
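The author introduces a simplified binarization approach for wide residual networks: weights are binarized with the sign function during training, and each layer is scaled by a constant, unlearned factor equal to the standard deviation used to initialize that layer. The resulting 1-bit-per-weight models reach state-of-the-art accuracy for binarized weights while needing less than 10 MB of parameter memory.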
Findings
Achieved 3.9% error on CIFAR-10 with 1-bit weights
Error rates halved compared to previous binarization methods on CIFAR
With a warm-restart learning-rate schedule, 1-bit-per-weight training is as fast as full-precision training and more accurate than with standard schedules
Abstract
For fast and energy-efficient deployment of trained deep neural networks on resource-constrained embedded hardware, each learned weight parameter should ideally be represented and stored using a single bit. Error-rates usually increase when this requirement is imposed. Here, we report large improvements in error rates on multiple datasets, for deep convolutional neural networks deployed with 1-bit-per-weight. Using wide residual networks as our main baseline, our approach simplifies existing methods that binarize weights by applying the sign function in training; we apply scaling factors for each layer with constant unlearned values equal to the layer-specific standard deviations used for initialization. For CIFAR-10, CIFAR-100 and ImageNet, and models with 1-bit-per-weight requiring less than 10 MB of parameter memory, we achieve error rates of 3.9%, 18.5% and 26.0% / 8.5% (Top-1 /…
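The binarization step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes He-normal initialization (standard deviation sqrt(2 / fan_in)) as the source of the layer-specific constant, and the helper names `he_std`, `binarize`, and `pack_signs` are hypothetical. Only the forward-pass weights and the packed storage are shown; how gradients are passed through the sign function during training is not covered here.

```python
import numpy as np


def he_std(shape):
    """Std of He-normal init for a conv kernel (assumed init scheme).

    shape is (out_channels, in_channels, kernel_h, kernel_w).
    """
    fan_in = int(np.prod(shape[1:]))
    return float(np.sqrt(2.0 / fan_in))


def binarize(w, scale):
    """Weights used in the forward pass: constant scale times sign(w)."""
    # Map exact zeros to +1 so every weight is representable by one bit.
    signs = np.where(w >= 0, 1.0, -1.0)
    return scale * signs


def pack_signs(w):
    """Store only the sign bits: 8 weights per byte."""
    return np.packbits((w >= 0).ravel())


rng = np.random.default_rng(0)
shape = (64, 32, 3, 3)           # hypothetical conv layer
scale = he_std(shape)            # constant, never learned
w_full = rng.normal(0.0, scale, size=shape)  # full-precision weights kept for training

w_bin = binarize(w_full, scale)  # what the layer would compute with
packed = pack_signs(w_full)      # what would be stored for deployment

print("scale:", scale)
print("distinct |w_bin| values:", np.unique(np.abs(w_bin)).size)
print("bytes at 32-bit:", w_full.size * 4, "bytes at 1-bit:", packed.nbytes)
```

Because the scale is a single constant per layer, it can be applied once to the layer output at deployment time, so only the packed sign bits need to be stored per weight. That is a 32x reduction relative to 32-bit floats, which is how a network with tens of millions of weights can fit in under 10 MB.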
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
