Direct Bethe Free Energy Minimization for Bayesian Neural Network

Pavel Prochazka

arXiv:2605.08446·cs.LG·May 13, 2026

TL;DR

This paper trains Bayesian neural networks by directly minimizing the Bethe free energy, which allows weights and hyperparameters to be optimized jointly without cross-validation.

Contribution

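The paper presents a training framework based on the Bethe free energy that handles any mixture of probabilistic and deterministic subgraphs, yields analytically tractable losses when the weight posterior is restricted to a last-layer Gaussian, and optimizes hyperparameters by empirical Bayes.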

Findings

01

With a last-layer Gaussian weight posterior and a Gaussian likelihood, the Bethe loss equals the exact marginal likelihood.

02

The method is competitive with standard approaches on multiple benchmarks.

03

Joint empirical Bayes optimization eliminates the need for outer hyperparameter tuning.
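
As a rough illustration of what empirical Bayes hyperparameter selection means in the Gaussian case, the sketch below scores a prior precision by the exact log marginal likelihood of a Bayesian linear model on fixed features and picks the best value from a grid. This is a textbook evidence computation, not the paper's implementation: the feature matrix `Phi`, the precisions `alpha` and `beta`, the synthetic data, and the grid search (standing in for the joint optimization the paper describes) are all assumptions made for illustration.

```python
import numpy as np

def log_evidence(Phi, y, alpha, beta):
    """Log marginal likelihood of y for y = Phi @ w + noise,
    with prior w ~ N(0, I / alpha) and noise ~ N(0, 1 / beta).
    Weight-space form: cost scales with the feature dimension d."""
    n, d = Phi.shape
    A = alpha * np.eye(d) + beta * Phi.T @ Phi      # posterior precision
    m = beta * np.linalg.solve(A, Phi.T @ y)        # posterior mean
    resid = y - Phi @ m
    energy = 0.5 * beta * resid @ resid + 0.5 * alpha * m @ m
    _, logdet_A = np.linalg.slogdet(A)
    return (0.5 * d * np.log(alpha) + 0.5 * n * np.log(beta)
            - energy - 0.5 * logdet_A - 0.5 * n * np.log(2 * np.pi))

def log_evidence_direct(Phi, y, alpha, beta):
    """Same quantity from the marginal y ~ N(0, I / beta + Phi Phi^T / alpha).
    Used only as a cross-check of the weight-space form."""
    n = Phi.shape[0]
    C = np.eye(n) / beta + Phi @ Phi.T / alpha
    _, logdet_C = np.linalg.slogdet(C)
    return -0.5 * (n * np.log(2 * np.pi) + logdet_C + y @ np.linalg.solve(C, y))

# Hypothetical synthetic data: 40 points, 5 last-layer features.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(40, 5))
w_true = rng.normal(size=5)
y = Phi @ w_true + 0.1 * rng.normal(size=40)

# Choose the prior precision by maximizing the evidence; no held-out set is used.
alphas = np.logspace(-3, 3, 25)
scores = [log_evidence(Phi, y, a, beta=100.0) for a in alphas]
best_alpha = alphas[int(np.argmax(scores))]
```

The point of the sketch is that the hyperparameter is scored on the training data alone through the marginal likelihood, which is why no outer validation loop is needed.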

Abstract

We propose training Bayesian neural networks by directly minimizing the Bethe free energy rather than maximizing a variational lower bound. On tree-structured factor graphs the Bethe free energy is exact; deterministic layers drop out of the objective and are trained by standard backpropagation, so the framework accommodates any mixture of probabilistic and deterministic subgraphs without modification. Restricting the weight posterior to a last-layer Gaussian yields analytically tractable losses: for a Gaussian likelihood the Bethe loss equals the exact marginal likelihood, and for a probit likelihood it reduces to a closed form via the probit-Gaussian convolution. Both objectives sit strictly between MAP and the ELBO ($L_{\text{MAP}} \leq L_{\text{Bethe}} \leq L_{\text{ELBO}}$), removing the structural Jensen gap that no choice of variational family can close. The Z-consistent prior…
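
The probit-Gaussian convolution mentioned in the abstract is presumably the standard identity $\int \Phi(a)\,\mathcal{N}(a \mid \mu, \sigma^2)\,da = \Phi\!\left(\mu / \sqrt{1 + \sigma^2}\right)$, where $\Phi$ is the standard normal CDF. The sketch below checks that identity numerically. It illustrates only the identity, not the paper's full probit loss, and the function names and test values are hypothetical.

```python
import math

def std_normal_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def probit_gaussian_closed_form(mu, sigma):
    """Expected value of Phi(a) when a ~ N(mu, sigma^2), in closed form."""
    return std_normal_cdf(mu / math.sqrt(1.0 + sigma ** 2))

def probit_gaussian_numeric(mu, sigma, n=20001, width=10.0):
    """Same expectation by the trapezoid rule on mu +/- width * sigma."""
    lo = mu - width * sigma
    hi = mu + width * sigma
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        a = lo + i * h
        pdf = math.exp(-0.5 * ((a - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n - 1) else 1.0   # trapezoid end-point weights
        total += weight * std_normal_cdf(a) * pdf
    return total * h

closed = probit_gaussian_closed_form(0.7, 1.3)
numeric = probit_gaussian_numeric(0.7, 1.3)
```

Because the expectation has a closed form, a probit likelihood under a Gaussian pre-activation can be evaluated without sampling or quadrature, which is what makes the corresponding loss analytically tractable.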
