On Cold Posteriors of Probabilistic Neural Networks: Understanding the Cold Posterior Effect and A New Way to Learn Cold Posteriors with Tight Generalization Guarantees
Yijie Zhang

TL;DR
This paper investigates the cold posterior effect in Bayesian neural networks, providing theoretical insights and a new learning method with strong generalization guarantees to improve uncertainty quantification.
Contribution
It offers a novel understanding of the cold posterior effect and introduces a new approach to learn cold posteriors with tight PAC-Bayesian generalization bounds.
Findings
Cold posteriors can improve predictive performance in Bayesian neural networks.
The proposed method achieves tighter generalization guarantees.
Theoretical analysis clarifies the role of temperature in Bayesian inference.
Abstract
Bayesian inference provides a principled probabilistic framework for quantifying uncertainty by updating beliefs based on prior knowledge and observed data through Bayes' theorem. In Bayesian deep learning, neural network weights are treated as random variables with prior distributions, allowing for a probabilistic interpretation and quantification of predictive uncertainty. However, Bayesian methods lack theoretical generalization guarantees for unseen data. PAC-Bayesian analysis addresses this limitation by offering a frequentist framework to derive generalization bounds for randomized predictors, thereby certifying the reliability of Bayesian methods in machine learning. Temperature, or inverse-temperature, originally from statistical mechanics in physics, naturally arises in various areas of statistical inference, including Bayesian inference and…
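To make the role of temperature concrete, here is a minimal sketch on a toy conjugate model, not the paper's method. It assumes one common convention, in which the likelihood is raised to an inverse temperature `lam` (so `lam = 1` is ordinary Bayes and `lam > 1` is a "cold" posterior); the paper's exact parameterization may differ. The function name `tempered_gaussian_posterior` and all numeric values are hypothetical, chosen only for illustration.

```python
def tempered_gaussian_posterior(xs, sigma2, mu0, tau2, lam):
    """Posterior over a Gaussian mean when the likelihood is raised to power lam.

    Assumed toy model (not from the paper):
      x_i ~ N(theta, sigma2) with known sigma2, prior theta ~ N(mu0, tau2).
    Tempering multiplies the data precision term by lam.
    """
    n = len(xs)
    precision = 1.0 / tau2 + lam * n / sigma2
    mean = (mu0 / tau2 + lam * sum(xs) / sigma2) / precision
    return mean, 1.0 / precision  # posterior mean and variance


data = [1.2, 0.8, 1.1, 0.9]  # illustrative observations

# lam = 1 recovers the standard Bayesian posterior.
bayes_mean, bayes_var = tempered_gaussian_posterior(data, 1.0, 0.0, 1.0, lam=1.0)

# lam = 4 is a cold posterior: narrower and pulled further toward the data.
cold_mean, cold_var = tempered_gaussian_posterior(data, 1.0, 0.0, 1.0, lam=4.0)

print(f"Bayes: mean={bayes_mean:.3f}, var={bayes_var:.3f}")
print(f"Cold:  mean={cold_mean:.3f}, var={cold_var:.3f}")
```

In this toy setting, cooling the posterior shrinks its variance and reduces the influence of the prior, which is the qualitative behaviour usually associated with cold posteriors. Whether this improves predictions in a neural network is an empirical question that the sketch does not address.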
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Adversarial Robustness in Machine Learning
