Flat Seeking Bayesian Neural Networks
Van-Anh Nguyen, Tung-Long Vuong, Hoang Phan, Thanh-Toan Do, Dinh Phung, Trung Le

TL;DR
This paper introduces a sharpness-aware Bayesian posterior for neural networks, leading to flatter models with improved generalization, demonstrated through experiments with state-of-the-art Bayesian Neural Networks.
Contribution
It develops a novel sharpness-aware Bayesian inference framework that encourages flatter, better-generalizing neural network models, addressing limitations of traditional posterior formulations.
Findings
Sharpness-aware posterior models exhibit better flatness.
Flat-seeking Bayesian Neural Networks outperform the baselines on all metrics of interest.
The approach enhances model generalization through improved flatness.
Abstract
Bayesian Neural Networks (BNNs) provide a probabilistic interpretation for deep learning models by imposing a prior distribution over model parameters and inferring a posterior distribution based on observed data. The model sampled from the posterior distribution can be used for providing ensemble predictions and quantifying prediction uncertainty. It is well-known that deep learning models with lower sharpness have better generalization ability. However, existing posterior inferences are not aware of sharpness/flatness in terms of formulation, possibly leading to high sharpness for the models sampled from them. In this paper, we develop theories, the Bayesian setting, and the variational inference approach for the sharpness-aware posterior. Specifically, the models sampled from our sharpness-aware posterior, and the optimal approximate posterior estimating this sharpness-aware…
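The two ideas the abstract relies on, ensemble prediction from posterior samples and the sharpness of a model, can be illustrated with a minimal sketch. This is not the paper's sharpness-aware posterior or its variational inference procedure. The function names `ensemble_predict` and `sharpness`, the toy quadratic losses, and the radius `rho` are all assumptions made for illustration, and sharpness is estimated here by random search over perturbations of norm `rho` rather than by an exact worst-case maximization.

```python
import numpy as np


def ensemble_predict(prob_samples):
    """Average class probabilities over models sampled from a posterior.

    prob_samples: shape (n_models, n_classes), one row per sampled model
    (hypothetical inputs; a real BNN would produce these by forward passes).
    Returns the averaged predictive distribution and its entropy in nats.
    """
    probs = np.asarray(prob_samples, dtype=float)
    mean = probs.mean(axis=0)
    entropy = -np.sum(mean * np.log(mean + 1e-12))
    return mean, entropy


def sharpness(loss_fn, theta, rho=0.1, n_dirs=64, seed=0):
    """Random-search estimate of max over ||e|| = rho of loss(theta + e) - loss(theta)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float)
    base = loss_fn(theta)
    worst = 0.0
    for _ in range(n_dirs):
        d = rng.standard_normal(theta.shape)
        e = rho * d / np.linalg.norm(d)  # perturbation rescaled to norm rho
        worst = max(worst, loss_fn(theta + e) - base)
    return worst


# Two toy losses sharing the minimizer theta = 0 but with different curvature.
def flat_loss(t):
    return 0.5 * 1.0 * np.sum(t ** 2)


def sharp_loss(t):
    return 0.5 * 100.0 * np.sum(t ** 2)


theta_star = np.zeros(3)
flat_s = sharpness(flat_loss, theta_star)
sharp_s = sharpness(sharp_loss, theta_star)

# Three hypothetical sampled models voting on a two-class input.
mean_probs, entropy = ensemble_predict([[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]])
```

Both toy losses reach the same minimum value, yet the high-curvature one degrades far more under a perturbation of the same size. That gap is what a sharpness-aware formulation is meant to penalize, while the entropy of the averaged prediction is one common way to summarize predictive uncertainty.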
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Gaussian Processes and Bayesian Inference · Domain Adaptation and Few-Shot Learning
Methods: Variational Inference · Attentive Walk-Aggregating Graph Neural Network
