Stochastic Weight Sharing for Bayesian Neural Networks

Moule Lin; Shuhao Guan; Weipeng Jing; Goetz Botterweck; Andrea Patane

arXiv:2505.17856·cs.LG·May 26, 2025

TL;DR

This paper introduces a stochastic weight sharing method for Bayesian Neural Networks that significantly reduces computational costs and model size while maintaining high accuracy and uncertainty estimation quality.

Contribution

It reinterprets weight-sharing quantization techniques using a stochastic framework with Gaussian distributions, enabling efficient training of large-scale BNNs.

Findings

01 Reduces the computational overhead of Bayesian learning by several orders of magnitude.

02 Compresses model parameters by approximately 50x and reduces model size by 75%.

03 Achieves accuracy and uncertainty estimation comparable to state-of-the-art methods.

Abstract

While offering a principled framework for uncertainty quantification in deep learning, the employment of Bayesian Neural Networks (BNNs) is still constrained by their increased computational requirements and the convergence difficulties when training very deep, state-of-the-art architectures. In this work, we reinterpret weight-sharing quantization techniques from a stochastic perspective in the context of training and inference with Bayesian Neural Networks (BNNs). Specifically, we leverage 2D adaptive Gaussian distributions, Wasserstein distance estimations, and alpha blending to encode the stochastic behaviour of a BNN in a lower dimensional, soft Gaussian representation. Through extensive empirical investigation, we demonstrate that our approach significantly reduces the computational overhead inherent in Bayesian learning by several orders of magnitude, enabling the efficient…
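
To make the weight-sharing idea concrete, here is a minimal sketch, not the authors' algorithm. It assumes each weight carries a 1D Gaussian posterior described by a (mean, standard deviation) pair, and that each weight is assigned to the nearest entry in a small set of shared Gaussians. Nearness is measured by the 2-Wasserstein distance, which has a closed form between 1D Gaussians: W2^2 = (mu1 - mu2)^2 + (sigma1 - sigma2)^2. The names `w2_squared` and `assign_to_shared` and all numeric values are hypothetical, and the alpha-blending step mentioned in the abstract is not modelled.

```python
from typing import List, Tuple

# Assumption: a weight's posterior is a 1D Gaussian stored as (mean, std).
Gaussian = Tuple[float, float]


def w2_squared(p: Gaussian, q: Gaussian) -> float:
    """Squared 2-Wasserstein distance between two 1D Gaussians (closed form)."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def assign_to_shared(weights: List[Gaussian], shared: List[Gaussian]) -> List[int]:
    """Return, for each weight, the index of its nearest shared Gaussian."""
    return [
        min(range(len(shared)), key=lambda k: w2_squared(w, shared[k]))
        for w in weights
    ]


# Hypothetical per-weight posteriors and a two-entry shared codebook.
weights = [(0.10, 0.05), (0.12, 0.04), (-0.90, 0.30), (-1.00, 0.25)]
shared = [(0.0, 0.05), (-1.0, 0.30)]

assignment = assign_to_shared(weights, shared)
print(assignment)
```

In this toy setting, four weights with two parameters each are replaced by two shared Gaussians plus one index per weight, which is the kind of saving a weight-sharing scheme aims for as the number of weights grows.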
