AdaSTE: An Adaptive Straight-Through Estimator to Train Binary Neural Networks
Huu Le, Rasmus Kjær Høier, Che-Tsung Lin, Christopher Zach

TL;DR
This paper introduces AdaSTE, an adaptive straight-through estimator for training binary neural networks. It casts training as a bilevel optimization problem, constructs flexible relaxations of that program, and reports favorable performance compared to existing approaches.
Contribution
The paper presents a novel bilevel optimization framework and an adaptive straight-through estimator for more effective training of binary neural networks.
Findings
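AdaSTE performs favorably compared to existing approaches in the reported experiments.
The training algorithm is as simple as existing straight-through methods such as BinaryConnect, while being more flexible.
In the backward pass, the estimator acts like a linear mapping conditionally rather than always.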
Abstract
We propose a new algorithm for training deep neural networks (DNNs) with binary weights. In particular, we first cast the problem of training binary neural networks (BiNNs) as a bilevel optimization instance and subsequently construct flexible relaxations of this bilevel program. The resulting training method shares its algorithmic simplicity with several existing approaches to train BiNNs, in particular with the straight-through gradient estimator successfully employed in BinaryConnect and subsequent methods. In fact, our proposed method can be interpreted as an adaptive variant of the original straight-through estimator that conditionally (but not always) acts like a linear mapping in the backward pass of error propagation. Experimental results demonstrate that our new algorithm offers favorable performance compared to existing approaches.
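To make the baseline concrete, the following is a minimal sketch of the original (non-adaptive) straight-through estimator in the style of BinaryConnect, not of AdaSTE itself. The abstract does not give AdaSTE's adaptive rule, so it is not reproduced here. The function names, the toy data, the squared-error loss, and the learning rate are all assumptions chosen for illustration.

```python
def sign(x):
    """Binarize to {-1, +1}; zero maps to +1 by convention (an assumption)."""
    return 1.0 if x >= 0 else -1.0


def ste_step(latent, inputs, target, lr=0.1):
    """One SGD step on squared error using a straight-through estimator.

    Forward: the prediction uses binarized weights sign(latent).
    Backward: d sign / d latent is treated as 1 (identity), so the gradient
    with respect to the binary weights is applied directly to the latents.
    """
    binary = [sign(w) for w in latent]
    pred = sum(b * x for b, x in zip(binary, inputs))
    err = pred - target
    # Gradient of 0.5 * err**2 with respect to each binary weight.
    grads = [err * x for x in inputs]
    # Straight-through: reuse that gradient for the latent weights,
    # then clip the latents to [-1, 1].
    return [max(-1.0, min(1.0, w - lr * g)) for w, g in zip(latent, grads)]


def train(data, latent, epochs=20, lr=0.1):
    """Repeatedly apply ste_step over a list of (inputs, target) pairs."""
    for _ in range(epochs):
        for inputs, target in data:
            latent = ste_step(latent, inputs, target, lr)
    return latent


# Toy problem (hypothetical): targets come from binary weights [1, -1, 1].
data = [
    ([1.0, 0.0, 0.0], 1.0),
    ([0.0, 1.0, 0.0], -1.0),
    ([0.0, 0.0, 1.0], 1.0),
]
latent = train(data, [-0.5, 0.5, -0.5])  # every initial sign is wrong
binary = [sign(w) for w in latent]
```

In this sketch the backward pass always behaves like the identity, which is a linear mapping. By the abstract's description, AdaSTE differs in that it acts like a linear mapping only conditionally.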
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms · Hydraulic Fracturing and Reservoir Analysis
