TL;DR
This paper introduces Bop2ndOrder, a second-order optimizer for Binarized Neural Networks (BNNs). It uses both the first and second raw moment estimates of the gradients to improve convergence speed and accuracy over previous BNN optimizers.
Contribution
The paper proposes Bop2ndOrder, a novel second-order optimizer for BNNs, with two variants, and demonstrates its effectiveness through extensive hyperparameter and performance evaluations.
Findings
Faster convergence on CIFAR10 and ImageNet datasets.
Robustness to hyperparameter variations.
Higher accuracy than previous BNN optimizers.
Abstract
The optimization of Binary Neural Networks (BNNs) relies on approximating the real-valued weights with their binarized representations. Current weight-update techniques follow the same approaches as traditional Neural Networks (NNs), with the extra requirement of approximating the derivative of the sign function for back-propagation, since its true derivative is the Dirac delta function; thus, efforts have focused on adapting full-precision techniques to work on BNNs. In the literature, only one previous effort (Bop) has tackled the problem of training BNNs directly with bit-flips, using the first raw moment estimate of the gradients and comparing it against a threshold to decide when to flip a weight. In this paper, we take an approach parallel to Adam that also uses the second raw moment estimate to normalize the first raw moment before the comparison with the threshold; we…
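The flip rule described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function name `bop2ndorder_step`, the parameter names `gamma`, `sigma`, `tau`, and `eps`, the Adam-style square root in the normalization, and the sign-agreement condition (taken from the original Bop rule as commonly stated) are all assumptions. The two variants mentioned in the paper are not distinguished here.

```python
import math


def bop2ndorder_step(weights, grads, m, v,
                     gamma=0.1, sigma=0.01, tau=0.5, eps=1e-10):
    """One flip-based update on binary weights (+1 / -1).

    Hypothetical sketch: all names and default values are illustrative.
    Returns the updated (weights, m, v) as new lists.
    """
    new_w, new_m, new_v = [], [], []
    for w, g, m_i, v_i in zip(weights, grads, m, v):
        # Exponential moving averages of the gradient (first raw moment)
        # and of the squared gradient (second raw moment).
        m_i = (1.0 - gamma) * m_i + gamma * g
        v_i = (1.0 - sigma) * v_i + sigma * g * g

        # Assumed Adam-style normalization of the first moment.
        ratio = m_i / (math.sqrt(v_i) + eps)

        # Flip when the normalized signal exceeds the threshold and
        # points in the same direction as the current weight
        # (assumed to match the original Bop condition).
        flip = abs(ratio) > tau and (ratio > 0) == (w > 0)

        new_w.append(-w if flip else w)
        new_m.append(m_i)
        new_v.append(v_i)
    return new_w, new_m, new_v


# Toy usage: three binary weights, one gradient step from zero state.
w, m, v = bop2ndorder_step(
    weights=[1, -1, 1],
    grads=[1.0, 1.0, 0.0],
    m=[0.0, 0.0, 0.0],
    v=[0.0, 0.0, 0.0],
    gamma=0.5, sigma=0.5, tau=0.5,
)
print(w)  # [-1, -1, 1]
```

In this toy run only the first weight flips: its normalized moment (about 0.71) exceeds the threshold and shares the weight's sign. The second weight sees the same signal but has the opposite sign, and the third receives no gradient. Dropping the division by `math.sqrt(v_i) + eps` would reduce the sketch to a plain first-moment threshold test, which is the Bop behavior the abstract describes.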
Taxonomy
Methods: FLIP · Adam
