Temperature Optimization for Bayesian Deep Learning
Kenyon Ng, Chris van der Heide, Liam Hodgkinson, Susan Wei

TL;DR
This paper introduces a data-driven method for selecting the optimal temperature in Bayesian Deep Learning to improve predictive performance, addressing the lack of systematic approaches beyond grid search.
Contribution
It proposes estimating the temperature as a model parameter directly from data, offering a more efficient alternative to grid search for optimizing Bayesian model predictions.
Findings
Method performs comparably to grid search
Reduces computational cost significantly
Highlights differing views on Cold Posterior Effect
Abstract
The Cold Posterior Effect (CPE) is a phenomenon in Bayesian Deep Learning (BDL), where tempering the posterior to a cold temperature often improves the predictive performance of the posterior predictive distribution (PPD). Although the term 'CPE' suggests colder temperatures are inherently better, the BDL community increasingly recognizes that this is not always the case. Despite this, there remains no systematic method for finding the optimal temperature beyond grid search. In this work, we propose a data-driven approach to select the temperature that maximizes test log-predictive density, treating the temperature as a model parameter and estimating it directly from the data. We empirically demonstrate that our method performs comparably to grid search, at a fraction of the cost, across both regression and classification tasks. Finally, we highlight the differing perspectives on CPE…
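As a rough illustration of what "treating the temperature as a model parameter and estimating it directly from the data" can look like, the sketch below fits a single inverse temperature beta by maximum likelihood for a classifier whose logits are held fixed. It is not the authors' implementation and the paper's actual procedure may differ. The function names `tempered_nll` and `fit_beta`, the synthetic data, the fixed-logits setting, and the choice of golden-section search are all assumptions made for illustration.

```python
import numpy as np


def tempered_nll(beta, logits, labels):
    """Mean negative log-likelihood of labels under softmax(beta * logits)."""
    z = beta * logits
    z = z - z.max(axis=1, keepdims=True)  # shift for a stable log-sum-exp
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def fit_beta(logits, labels, lo=1e-3, hi=1e3, iters=80):
    """Estimate beta by golden-section search over log(beta).

    The NLL is convex in beta, hence unimodal in log(beta), so a
    bracketing search is enough for this one-dimensional problem.
    """
    ratio = (np.sqrt(5.0) - 1.0) / 2.0
    a, b = np.log(lo), np.log(hi)
    for _ in range(iters):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if tempered_nll(np.exp(c), logits, labels) < tempered_nll(np.exp(d), logits, labels):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return float(np.exp(0.5 * (a + b)))


# Synthetic check (hypothetical data): labels are drawn at a known beta.
rng = np.random.default_rng(0)
true_beta = 2.0
logits = rng.normal(size=(5000, 5))
# Gumbel-max trick: argmax(beta * logits + Gumbel noise) ~ softmax(beta * logits)
labels = np.argmax(true_beta * logits + rng.gumbel(size=logits.shape), axis=1)

beta_hat = fit_beta(logits, labels)
print(f"estimated beta: {beta_hat:.3f} (true value {true_beta})")
```

On this synthetic data the estimate should land close to the value used to generate the labels. A grid search would instead evaluate `tempered_nll` once per candidate value, which becomes expensive when each evaluation requires refitting or resampling a model.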
Peer Reviews
Decision·UAI 2025 Poster
* The paper reads very well.
* The topic of this paper is highly relevant, given the increasing focus on uncertainty quantification for neural networks.
* The discussion on the cold posterior effect is interesting, provides additional insights, and bridges understanding between two communities that are often disjoint.
I will elaborate on these weaknesses in questions.
1. Possible lack of novelty.
2. Overclaims in experimental results.
3. Lacking metrics in experimental results.
I do think that an approach like the one proposed could be interesting. It might be a good way to find temperatures for tempered Bayesian models, but there is still active debate about whether tempering Bayesian models is actually necessary or whether it is an artifact of poor optimization or data augmentation.
- Written in a misleading way. The authors introduce theory for choosing the inverse temperature $\beta$ that is not part of the standard Bayesian literature in Section 3.0. They do not use it later on, though. This feels like a tactic to make the paper look more complex than it is. What they do in the end is find the temperature using a simple MLE. Just like one would usually train a neural network just with a tempered prediction distribution, which they derive intuitively from a dataset-depend
The paper is notationally clear and seems solid. The topic is important and of interest to the community. Theories related to the cold posterior effect are discussed in good detail. Additional material in the appendix is illuminating. Source code is published for reproducibility.
My primary critique is that the empirical study is somewhat limited. While the proposed method appears straightforward and effective, its comparative performance with alternative approaches is not entirely clear. For instance, while the paper demonstrates how the tempered posterior can be reframed as an equivalent 'tempered model,' it does not include a comparison with direct Bayesian inference within this model, treating the temperature as a model parameter. Another common technique involves fi
Taxonomy
Topics: Gaussian Processes and Bayesian Inference
Methods: Collaborative Preference Embedding
