Training Multi-Layer Binary Neural Networks With Local Binary Error Signals

Luca Colombo, Fabrizio Pittorino, and Manuel Roveri

arXiv:2412.00119 · cs.LG · December 8, 2025

TL;DR

This paper introduces a fully binary, gradient-free training algorithm for multi-layer Binary Neural Networks that uses local binary error signals, significantly improving accuracy and efficiency over existing methods.

Contribution

It presents the first fully binary, gradient-free training method for multi-layer BNNs using local binary error signals and integer-valued weights, enhancing neurobiological plausibility.

Findings

01. Up to +35.47% accuracy improvement over the single-layer state of the art

02. Up to +35.30% accuracy improvement over full-precision SGD at the same memory cost

03. Reduces computational cost by a factor of 100 to 1000

Abstract

Binary Neural Networks (BNNs) significantly reduce computational complexity and memory usage in machine and deep learning by representing weights and activations with just one bit. However, most existing training algorithms for BNNs rely on quantization-aware floating-point Stochastic Gradient Descent (SGD), limiting the full exploitation of binary operations to the inference phase only. In this work, we propose, for the first time, a fully binary and gradient-free training algorithm for multi-layer BNNs, eliminating the need for back-propagated floating-point gradients. Specifically, the proposed algorithm relies on local binary error signals and binary weight updates, employing integer-valued hidden weights that serve as a synaptic metaplasticity mechanism, thereby enhancing its neurobiological plausibility. Our proposed solution enables the training of binary multi-layer perceptrons…
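The mechanism described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it is a perceptron-style toy for a single neuron, written under several assumptions that do not come from the paper. The function names (`binary_weights`, `xnor_popcount_dot`, `local_update`), the update step of 2, and the convention of keeping hidden weights odd (so a sign is always defined) are all hypothetical choices made here. What it shows is the general idea: the binary weight is the sign of an integer hidden weight, the ±1 dot product reduces to counting agreements (XNOR followed by popcount), and a hidden weight with a larger magnitude needs more updates before its binary weight flips, which is the metaplasticity-like effect.

```python
# Illustrative toy only; NOT the training rule from the paper.
# Assumptions (hypothetical): odd integer hidden weights, update step of 2,
# and a locally supplied binary target in {+1, -1} for the neuron.

def sign(v):
    """Map a nonzero integer to +1 or -1."""
    return 1 if v > 0 else -1

def binary_weights(hidden):
    """The binary weights are the signs of the integer hidden weights."""
    return [sign(h) for h in hidden]

def xnor_popcount_dot(x, w):
    """Dot product of two +1/-1 vectors, computed by counting agreements.

    Counting positions where the entries match is the popcount of the XNOR
    of the two bit vectors; the dot product is then 2 * agreements - n.
    """
    agreements = sum(1 for a, b in zip(x, w) if a == b)
    return 2 * agreements - len(x)

def neuron_output(x, hidden):
    """Binary activation of one neuron (use an odd input length to avoid ties)."""
    return sign(xnor_popcount_dot(x, binary_weights(hidden)))

def local_update(x, hidden, target, step=2):
    """If the output disagrees with the local binary target, move every
    hidden weight by +/-step toward agreement; otherwise leave it unchanged."""
    if neuron_output(x, hidden) == target:
        return list(hidden)
    return [h + step * target * xi for h, xi in zip(hidden, x)]

x = [1, -1, 1]
hidden = [-3, 1, -1]                      # the first weight is "more consolidated"
updated = local_update(x, hidden, target=1)
print(updated, binary_weights(updated), neuron_output(x, updated))
```

In this toy run the two hidden weights of magnitude 1 change sign after a single update, while the hidden weight of magnitude 3 only moves closer to zero and keeps its binary value. Only increments, decrements, comparisons, and counting are needed; no floating-point gradient appears anywhere.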

Taxonomy

Methods: Stochastic Gradient Descent