Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer
Shiye Lei, Zhuozhuo Tu, Leszek Rutkowski, Feng Zhou, Li Shen, Fengxiang He, Dacheng Tao

TL;DR
This paper introduces STF-BNN, a Bayesian neural network approach that scales to large models by Bayesianizing only the first layer, improving performance and robustness while reducing training costs.
Contribution
The paper proposes a novel spatial-temporal-fusion BNN method that selectively Bayesianizes the first layer, enabling scalable and efficient Bayesian neural networks with theoretical guarantees.
Findings
Achieves state-of-the-art prediction and uncertainty quantification.
Enhances adversarial robustness and privacy preservation.
Reduces training time and memory costs significantly.
Abstract
Bayesian neural networks (BNNs) have become a principal approach to alleviate overconfident predictions in deep learning, but they often suffer from scaling issues due to a large number of distribution parameters. In this paper, we discover that the first layer of a deep network possesses multiple disparate optima when solely retrained. This indicates a large posterior variance when the first layer is replaced by a Bayesian layer, which motivates us to design a spatial-temporal-fusion BNN (STF-BNN) for efficiently scaling BNNs to large models: (1) a neural network is first trained normally from scratch to realize fast training; and (2) the first layer is then converted to a Bayesian layer and inferred by stochastic variational inference, while the other layers are fixed. Compared to vanilla BNNs, our approach can greatly reduce the training time and the number of parameters, which contributes to…
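The two-step recipe in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes a mean-field Gaussian over the first-layer weights, a softplus parameterization of the standard deviation, and a single frozen linear layer standing in for the rest of the network. The names `BayesianFirstLayer`, `frozen_head`, and `predict`, and the toy weights, are hypothetical. The variational training step (optimizing the ELBO) is omitted; only the sampling-based forward pass and the parameter count are shown.

```python
import math
import random


def softplus(x):
    # Maps an unconstrained value to a positive standard deviation.
    return math.log1p(math.exp(x))


class BayesianFirstLayer:
    """Mean-field Gaussian over first-layer weights (hypothetical helper)."""

    def __init__(self, pretrained_weights, init_rho=-3.0):
        # Variational means start from the deterministically trained weights (step 1).
        self.mu = [row[:] for row in pretrained_weights]
        # One rho per weight; the std is softplus(rho). init_rho is an assumed value.
        self.rho = [[init_rho] * len(row) for row in pretrained_weights]

    def sample_forward(self, x, rng):
        # Reparameterization: w = mu + softplus(rho) * eps, with eps ~ N(0, 1).
        out = []
        for mu_row, rho_row in zip(self.mu, self.rho):
            total = 0.0
            for m, r, xi in zip(mu_row, rho_row, x):
                w = m + softplus(r) * rng.gauss(0.0, 1.0)
                total += w * xi
            out.append(max(0.0, total))  # ReLU activation (assumed)
        return out

    def num_parameters(self):
        # A mean and a rho for every first-layer weight.
        return 2 * sum(len(row) for row in self.mu)


def frozen_head(h, weights):
    # Remaining layers stay deterministic and fixed during step 2.
    return [sum(w * hi for w, hi in zip(row, h)) for row in weights]


def predict(layer, head_weights, x, num_samples=200, seed=0):
    # Monte Carlo estimate of the predictive mean and variance.
    rng = random.Random(seed)
    outputs = [
        frozen_head(layer.sample_forward(x, rng), head_weights)
        for _ in range(num_samples)
    ]
    n_out = len(head_weights)
    mean = [sum(o[k] for o in outputs) / num_samples for k in range(n_out)]
    var = [
        sum((o[k] - mean[k]) ** 2 for o in outputs) / num_samples
        for k in range(n_out)
    ]
    return mean, var


# Toy "pretrained" weights: 2 inputs -> 3 hidden units -> 1 output.
first = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
head = [[1.0, -1.0, 0.5]]

layer = BayesianFirstLayer(first)
mean, var = predict(layer, head, x=[1.0, 2.0])
```

In this sketch only the first layer carries distribution parameters, so the count of variational parameters is twice the number of first-layer weights regardless of how large the frozen part of the network is.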
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications
