On Cold Posteriors of Probabilistic Neural Networks: Understanding the Cold Posterior Effect and A New Way to Learn Cold Posteriors with Tight Generalization Guarantees

Yijie Zhang

arXiv:2410.15310·cs.LG·October 22, 2024

TL;DR

This paper investigates the cold posterior effect in Bayesian neural networks, providing theoretical insights and a new learning method with strong generalization guarantees to improve uncertainty quantification.

Contribution

It offers a novel understanding of the cold posterior effect and introduces a new approach to learn cold posteriors with tight PAC-Bayesian generalization bounds.

Findings

01

Cold posteriors can improve predictive performance in Bayesian neural networks.

02

The proposed method achieves better generalization guarantees.

03

Theoretical analysis clarifies the role of temperature in Bayesian inference.

Abstract

Bayesian inference provides a principled probabilistic framework for quantifying uncertainty by updating beliefs based on prior knowledge and observed data through Bayes' theorem. In Bayesian deep learning, neural network weights are treated as random variables with prior distributions, allowing for a probabilistic interpretation and quantification of predictive uncertainty. However, Bayesian methods lack theoretical generalization guarantees for unseen data. PAC-Bayesian analysis addresses this limitation by offering a frequentist framework to derive generalization bounds for randomized predictors, thereby certifying the reliability of Bayesian methods in machine learning. Temperature $T$, or inverse-temperature $\lambda = \frac{1}{T}$, originally from statistical mechanics in physics, naturally arises in various areas of statistical inference, including Bayesian inference and…
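To make the role of the inverse temperature concrete, here is a minimal toy sketch; it is not the paper's method. It assumes the common convention in which only the likelihood is raised to the power $\lambda = 1/T$, and it uses a Beta-Bernoulli model because the tempered posterior is then available in closed form. The function names and the example counts are hypothetical.

```python
def tempered_beta_posterior(a, b, successes, failures, lam):
    """Beta posterior parameters when the Bernoulli likelihood is raised to lam.

    Assumption: only the likelihood is tempered, so
    posterior ∝ likelihood**lam * prior, with a Beta(a, b) prior.
    """
    return a + lam * successes, b + lam * failures


def beta_variance(alpha, beta):
    """Variance of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    return alpha * beta / (total * total * (total + 1.0))


if __name__ == "__main__":
    # Hypothetical data: 7 successes and 3 failures, uniform Beta(1, 1) prior.
    for T in (2.0, 1.0, 0.5):
        lam = 1.0 / T
        alpha, beta = tempered_beta_posterior(1.0, 1.0, 7, 3, lam)
        print(f"T={T}: Beta({alpha}, {beta}), variance={beta_variance(alpha, beta):.5f}")
```

In this toy setting, $\lambda = 1$ recovers the standard Bayesian posterior, and a cold posterior ($T < 1$, so $\lambda > 1$) counts each observation more than once and therefore concentrates more tightly around the data.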

Code & Models

Repositories

pyijiezhang/cpe-underfit
PyTorch · Official

Taxonomy

Topics

Neural Networks and Applications · Model Reduction and Neural Networks · Adversarial Robustness in Machine Learning