A Principled Bayesian Framework for Training Binary and Spiking Neural Networks

James A. Walker; Moein Khajehnejad; Adeel Razi

arXiv:2505.17962·cs.LG·May 26, 2025

TL;DR

This paper introduces a Bayesian framework for training binary and spiking neural networks that achieves state-of-the-art performance without normalization layers, using importance-weighted straight-through (IW-ST) estimators and variational inference.

Contribution

The paper presents a Bayesian approach, built on importance-weighted straight-through (IW-ST) estimators, for end-to-end training of binary and spiking neural networks, eliminating the need for normalization layers.

Findings

01. Achieves state-of-the-art performance on the CIFAR-10, DVS Gesture, and SHD datasets.

02. Enables training of deep residual networks without normalization layers.

03. Matches or exceeds existing methods while using neither normalization layers nor hand-tuned surrogate gradients.

Abstract

We propose a Bayesian framework for training binary and spiking neural networks that achieves state-of-the-art performance without normalisation layers. Unlike commonly used surrogate gradient methods -- often heuristic and sensitive to hyperparameter choices -- our approach is grounded in a probabilistic model of noisy binary networks, enabling fully end-to-end gradient-based optimisation. We introduce importance-weighted straight-through (IW-ST) estimators, a unified class generalising straight-through and relaxation-based estimators. We characterise the bias-variance trade-off in this family and derive a bias-minimising objective implemented via an auxiliary loss. Building on this, we introduce Spiking Bayesian Neural Networks (SBNNs), a variational inference framework that uses posterior noise to train Binary and Spiking Neural Networks with IW-ST. This Bayesian approach minimises…
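To make the estimator terminology in the abstract concrete, here is a minimal sketch of the two standard baselines it mentions, a straight-through estimator and a relaxation-style surrogate, for a single noisy binary unit. It is not the paper's IW-ST estimator or its auxiliary loss. The logistic noise model, the squared-error loss, and all function names (`sample_binary`, `identity_st_grad`, `relaxed_grad`, `exact_grad`, `average_estimates`) are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(x):
    """Logistic function; here it gives P(output = 1) for a noisy binary unit."""
    return 1.0 / (1.0 + math.exp(-x))


def sample_binary(pre_activation, rng):
    """Forward pass: sample a 0/1 output with P(1) = sigmoid(pre_activation)."""
    return 1.0 if rng.random() < sigmoid(pre_activation) else 0.0


def identity_st_grad(upstream_grad):
    """Straight-through backward pass: treat d(output)/d(pre_activation) as 1."""
    return upstream_grad


def relaxed_grad(pre_activation, upstream_grad):
    """Relaxation-style backward pass: use the sigmoid's slope as a surrogate."""
    p = sigmoid(pre_activation)
    return upstream_grad * p * (1.0 - p)


def exact_grad(pre_activation, target=1.0):
    """Exact d/da of E[(s - target)^2] for s ~ Bernoulli(sigmoid(a))."""
    p = sigmoid(pre_activation)
    # E[loss] = p * (1 - target)^2 + (1 - p) * target^2, so differentiate in p
    # and multiply by dp/da = p * (1 - p).
    return ((1.0 - target) ** 2 - target ** 2) * p * (1.0 - p)


def average_estimates(pre_activation, target=1.0, n_samples=20000, seed=0):
    """Monte Carlo mean of both surrogate gradients for the squared-error loss."""
    rng = random.Random(seed)
    st_total = 0.0
    relaxed_total = 0.0
    for _ in range(n_samples):
        s = sample_binary(pre_activation, rng)
        upstream = 2.0 * (s - target)  # d(loss)/d(output)
        st_total += identity_st_grad(upstream)
        relaxed_total += relaxed_grad(pre_activation, upstream)
    return st_total / n_samples, relaxed_total / n_samples


if __name__ == "__main__":
    a = 1.0
    st_mean, relaxed_mean = average_estimates(a)
    print("exact gradient:      ", exact_grad(a))
    print("straight-through avg:", st_mean)
    print("relaxed avg:         ", relaxed_mean)
```

In this toy setup both surrogates point in the same direction as the exact gradient but differ from it in magnitude, which is one simple way to see the bias that the abstract's bias-variance analysis is concerned with.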
