On the role of synaptic stochasticity in training low-precision neural networks

Carlo Baldassi; Federica Gerace; Hilbert J. Kappen; Carlo Lucibello; Luca Saglietti; Enzo Tartaglione; Riccardo Zecchina

arXiv:1710.09825·cond-mat.dis-nn·July 4, 2018

TL;DR

This paper shows that a neural network model with stochastic binary weights naturally gives prominence to exponentially rare dense regions of solutions, which are robust and generalize well, whereas typical solutions are isolated and hard to find. It also introduces a gradient-based method for training such models.

Contribution

The paper introduces a gradient descent procedure that acts on real values parametrizing a probability distribution over binary synapses, and it highlights the role of synaptic stochasticity in finding robust solutions.

Findings

01. A model with stochastic binary weights gives prominence to dense, robust regions of solutions.

02. Typical solutions are isolated and hard to find.

03. An algorithmic extension aimed at training discrete deep neural networks is also investigated.

Abstract

Stochasticity and limited precision of synaptic weights in neural network models are key aspects of both biological and hardware modeling of learning processes. Here we show that a neural network model with stochastic binary weights naturally gives prominence to exponentially rare dense regions of solutions with a number of desirable properties such as robustness and good generalization performance, while typical solutions are isolated and hard to find. Binary solutions of the standard perceptron problem are obtained from a simple gradient descent procedure on a set of real values parametrizing a probability distribution over the binary synapses. Both analytical and numerical results are presented. An algorithmic extension aimed at training discrete deep neural networks is also investigated.
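
The gradient descent procedure mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: each binary synapse is +1 with probability (1 + m_i)/2 where m_i = tanh(theta_i), the summed input is treated as Gaussian (a central limit approximation), the objective is the negative log-probability of classifying each pattern correctly, and the data come from a random binary teacher. The names `make_teacher_data`, `loss_and_grad`, and `train` are hypothetical.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF, floored so log() and division stay finite."""
    return max(0.5 * math.erfc(-z / math.sqrt(2.0)), 1e-300)


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def make_teacher_data(n, p, rng):
    """Random +-1 patterns labeled by a random binary teacher (illustrative data).

    Use an odd n so the teacher's summed input is never zero.
    """
    teacher = [rng.choice((-1, 1)) for _ in range(n)]
    patterns = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(p)]
    labels = [1 if sum(t * x for t, x in zip(teacher, xs)) > 0 else -1
              for xs in patterns]
    return patterns, labels


def loss_and_grad(theta, patterns, labels):
    """Negative log-likelihood of correct classification and its gradient in theta.

    Assumption: synapse i has mean m_i = tanh(theta_i) and variance 1 - m_i^2,
    and the summed input is approximated as Gaussian.
    """
    m = [math.tanh(t) for t in theta]
    loss = 0.0
    grad = [0.0] * len(theta)
    for xs, y in zip(patterns, labels):
        mean = sum(mi * x for mi, x in zip(m, xs))
        var = sum((1.0 - mi * mi) * x * x for mi, x in zip(m, xs)) + 1e-9
        sigma = math.sqrt(var)
        z = y * mean / sigma
        cdf = normal_cdf(z)
        loss -= math.log(cdf)
        ratio = normal_pdf(z) / cdf  # derivative of log(cdf) with respect to z
        for i, (mi, x) in enumerate(zip(m, xs)):
            # dz/dm_i, including the dependence of sigma on m_i
            dz_dm = y * (x / sigma + mean * mi * x * x / sigma ** 3)
            # chain rule through m_i = tanh(theta_i)
            grad[i] -= ratio * dz_dm * (1.0 - mi * mi)
    return loss, grad


def train(patterns, labels, steps=200, lr=0.05):
    """Plain gradient descent on theta, then binarize by the sign of each theta_i."""
    theta = [0.0] * len(patterns[0])
    losses = []
    for _ in range(steps):
        loss, grad = loss_and_grad(theta, patterns, labels)
        losses.append(loss)  # loss measured before this step's update
        theta = [t - lr * g for t, g in zip(theta, grad)]
    weights = [1 if t >= 0 else -1 for t in theta]
    return weights, losses


if __name__ == "__main__":
    rng = random.Random(0)
    patterns, labels = make_teacher_data(21, 10, rng)
    weights, losses = train(patterns, labels)
    errors = sum(
        1 for xs, y in zip(patterns, labels)
        if y * sum(w * x for w, x in zip(weights, xs)) <= 0
    )
    print("initial loss:", losses[0], "final loss:", losses[-1])
    print("training errors of the binarized weights:", errors)
```

Taking the sign of each parameter at the end picks the most probable binary configuration under the learned distribution. That is one simple way to read out a binary solution, chosen here for brevity.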
