AdaSTE: An Adaptive Straight-Through Estimator to Train Binary Neural Networks

Huu Le; Rasmus Kjær Høier; Che-Tsung Lin; Christopher Zach

arXiv:2112.02880 · cs.LG · December 7, 2021

TL;DR

This paper introduces AdaSTE, an adaptive straight-through estimator for training neural networks with binary weights. It is derived from flexible relaxations of a bilevel optimization formulation, and the reported experiments show favorable performance compared to existing approaches.

Contribution

The paper casts the training of binary neural networks (BiNNs) as a bilevel optimization problem. Relaxing this bilevel program yields a training method that can be interpreted as an adaptive variant of the straight-through estimator.

Findings

01

AdaSTE compares favorably with existing BiNN training approaches in the reported experiments.

02

The training method is as algorithmically simple as existing straight-through approaches such as BinaryConnect, while its underlying relaxations are flexible.

03

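In the backward pass of error propagation, the estimator acts like a linear mapping conditionally rather than always, unlike the original straight-through estimator.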

Abstract

We propose a new algorithm for training deep neural networks (DNNs) with binary weights. In particular, we first cast the problem of training binary neural networks (BiNNs) as a bilevel optimization instance and subsequently construct flexible relaxations of this bilevel program. The resulting training method shares its algorithmic simplicity with several existing approaches to train BiNNs, in particular with the straight-through gradient estimator successfully employed in BinaryConnect and subsequent methods. In fact, our proposed method can be interpreted as an adaptive variant of the original straight-through estimator that conditionally (but not always) acts like a linear mapping in the backward pass of error propagation. Experimental results demonstrate that our new algorithm offers favorable performance compared to existing approaches.
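
The straight-through estimator mentioned in the abstract can be illustrated with a minimal sketch. The pure-Python example below shows the plain (non-adaptive) estimator in the style of BinaryConnect, not AdaSTE itself: the forward pass uses sign-binarized weights, while the backward pass treats the sign function as the identity and updates real-valued latent weights. The toy least-squares task, the function names (`sign`, `forward`, `ste_grad`, `train`, `make_data`), the learning rate, and the clipping of latent weights to [-1, 1] are assumptions made for illustration.

```python
import random


def sign(x):
    # Binarize to {-1, +1}; zero maps to +1 by convention in this sketch.
    return 1.0 if x >= 0.0 else -1.0


def forward(theta, x):
    # Forward pass: use the binarized weights sign(theta), not theta itself.
    w = [sign(t) for t in theta]
    return sum(wi * xi for wi, xi in zip(w, x))


def ste_grad(theta, x, y):
    # Loss is 0.5 * (pred - y)^2, so dL/dw_i = (pred - y) * x_i.
    # Plain STE: pretend d sign(theta_i) / d theta_i == 1 and pass the
    # gradient w.r.t. the binary weights straight through to theta.
    err = forward(theta, x) - y
    return [err * xi for xi in x]


def train(data, dim, lr=0.05, epochs=200, seed=0):
    # Latent real-valued weights; only their signs are used in forward().
    rng = random.Random(seed)
    theta = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    for _ in range(epochs):
        for x, y in data:
            g = ste_grad(theta, x, y)
            # SGD step on the latent weights, then clip to [-1, 1]
            # (an assumption borrowed from common BinaryConnect practice).
            theta = [max(-1.0, min(1.0, t - lr * gi))
                     for t, gi in zip(theta, g)]
    return theta


def make_data(w_true, n=64, seed=1):
    # Hypothetical toy data: targets generated by a known binary weight vector.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in w_true]
        y = sum(w * xi for w, xi in zip(w_true, x))
        data.append((x, y))
    return data
```

A usage example under the same assumptions: `theta = train(make_data([1.0, -1.0, 1.0, -1.0]), dim=4)` followed by `[sign(t) for t in theta]` gives the binary weights the sketch has learned. An adaptive variant in the sense of the abstract would replace the identity in `ste_grad` with a mapping that is linear only under certain conditions; the abstract does not specify those conditions, so they are not sketched here.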

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms · Hydraulic Fracturing and Reservoir Analysis