Exact Backpropagation in Binary Weighted Networks with Group Weight Transformations
Yaniv Shulman

TL;DR
This paper introduces a differentiable method for training binary weighted neural networks using backpropagation, enabling efficient model compression without significant modifications to existing architectures.
Contribution
It presents a novel deterministic, differentiable transformation for binary weights that allows standard backpropagation to optimize binary networks effectively.
Findings
Binary networks can be trained with standard optimization techniques.
The method achieves competitive performance on image classification tasks.
Training overhead is minimal and compatible with existing architectures.
Abstract
Quantization based model compression serves as high performing and fast approach for inference that yields models which are highly compressed when compared to their full-precision floating point counterparts. The most extreme quantization is a 1-bit representation of parameters such that they have only two possible values, typically -1(0) or +1, enabling efficient implementation of the ubiquitous dot product using only additions. The main contribution of this work is the introduction of a method to smooth the combinatorial problem of determining a binary vector of weights to minimize the expected loss for a given objective by means of empirical risk minimization with backpropagation. This is achieved by approximating a multivariate binary state over the weights utilizing a deterministic and differentiable transformation of real-valued, continuous parameters. The proposed method adds…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Neural Network Applications · Image and Signal Denoising Methods · Sparse and Compressive Sensing Techniques
MethodsStochastic Gradient Descent
