Temperature Optimization for Bayesian Deep Learning

Kenyon Ng; Chris van der Heide; Liam Hodgkinson; Susan Wei

arXiv:2410.05757·stat.ML·October 27, 2025

TL;DR

This paper introduces a data-driven method for selecting the optimal temperature in Bayesian Deep Learning to improve predictive performance, addressing the lack of systematic approaches beyond grid search.

Contribution

It proposes estimating the temperature as a model parameter directly from data, offering a more efficient alternative to grid search for optimizing Bayesian model predictions.
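To make the grid-search baseline concrete, here is a minimal sketch on a toy conjugate Gaussian model. It is not the paper's method or experimental setup: the model, the function names (`tempered_posterior`, `heldout_lpd`, `grid_search_beta`), the grid values, and the convention of tempering only the likelihood with inverse temperature `beta = 1/T` are all assumptions made for illustration.

```python
import math
import random

def tempered_posterior(xs, beta, sigma2=1.0, tau2=1.0):
    """Mean and variance of the tempered posterior over theta.

    Assumed toy model: x_i ~ N(theta, sigma2), theta ~ N(0, tau2).
    The likelihood is raised to the power beta = 1/T (beta > 1 is "cold").
    """
    precision = 1.0 / tau2 + beta * len(xs) / sigma2
    mean = (beta * sum(xs) / sigma2) / precision
    return mean, 1.0 / precision

def heldout_lpd(train, heldout, beta, sigma2=1.0, tau2=1.0):
    """Average log posterior predictive density on held-out points."""
    mean, var = tempered_posterior(train, beta, sigma2, tau2)
    pred_var = sigma2 + var  # observation noise plus posterior uncertainty
    total = 0.0
    for x in heldout:
        total += -0.5 * math.log(2 * math.pi * pred_var)
        total += -0.5 * (x - mean) ** 2 / pred_var
    return total / len(heldout)

def grid_search_beta(train, heldout, grid):
    """Return the grid value of beta with the highest held-out LPD."""
    return max(grid, key=lambda b: heldout_lpd(train, heldout, b))

# Synthetic data (hypothetical; not from the paper).
rng = random.Random(0)
train = [rng.gauss(2.0, 1.0) for _ in range(20)]
heldout = [rng.gauss(2.0, 1.0) for _ in range(200)]
grid = [0.1, 0.3, 1.0, 3.0, 10.0]
best_beta = grid_search_beta(train, heldout, grid)
```

In this toy case each grid point costs one closed-form update. In deep models each grid point typically requires a separate posterior approximation, which is the cost that estimating the temperature directly is meant to avoid.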

Findings

01 The method performs comparably to grid search.

02 It runs at a fraction of the computational cost of grid search.

03 It highlights differing views on the Cold Posterior Effect.

Abstract

The Cold Posterior Effect (CPE) is a phenomenon in Bayesian Deep Learning (BDL), where tempering the posterior to a cold temperature often improves the predictive performance of the posterior predictive distribution (PPD). Although the term 'CPE' suggests colder temperatures are inherently better, the BDL community increasingly recognizes that this is not always the case. Despite this, there remains no systematic method for finding the optimal temperature beyond grid search. In this work, we propose a data-driven approach to select the temperature that maximizes test log-predictive density, treating the temperature as a model parameter and estimating it directly from the data. We empirically demonstrate that our method performs comparably to grid search, at a fraction of the cost, across both regression and classification tasks. Finally, we highlight the differing perspectives on CPE…
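For readers unfamiliar with the terminology, one common way to write the tempered posterior and its PPD is sketched below. The notation ($\theta$ for parameters, $\mathcal{D}$ for training data, $\pi$ for the prior, $\beta = 1/T$ for the inverse temperature) is assumed here and may differ from the paper's. Some works temper the prior as well as the likelihood.

```latex
\begin{align}
p_\beta(\theta \mid \mathcal{D}) &\propto p(\mathcal{D} \mid \theta)^{\beta}\, \pi(\theta),
  \qquad \beta = 1/T, \\
p_\beta(y^\ast \mid x^\ast, \mathcal{D}) &= \int p(y^\ast \mid x^\ast, \theta)\,
  p_\beta(\theta \mid \mathcal{D})\, \mathrm{d}\theta .
\end{align}
```

Under this convention a cold posterior corresponds to $T < 1$, that is $\beta > 1$, and $\beta = 1$ recovers the standard Bayes posterior.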

Peer Reviews

Decision·UAI 2025 Poster

Reviewer 01 · Rating 5 · Confidence 3

Strengths

* The paper reads very well.
* The topic is highly relevant, given the increasing focus on uncertainty quantification for neural networks.
* The discussion of the cold posterior effect is interesting; it provides additional insight and bridges two communities that are often disjoint.

Weaknesses

I will elaborate on these weaknesses in questions.
1. Possible lack of novelty.
2. Overclaims in experimental results.
3. Lacking metrics in experimental results.

Reviewer 02 · Rating 3 · Confidence 3

Strengths

I do think that an approach like the one proposed could be interesting. It might be a good way to find temperatures for tempered Bayesian models, but there is still active debate about whether tempering Bayesian models is actually necessary or whether it is an artifact of poor optimization or data augmentation.

Weaknesses

- Written in a misleading way. The authors introduce theory for choosing the inverse temperature $\beta$ that is not part of the standard Bayesian literature in Section 3.0. They do not use it later on, though. This feels like a tactic to make the paper look more complex than it is. What they do in the end is find the temperature using a simple MLE. Just like one would usually train a neural network just with a tempered prediction distribution, which they derive intuitively from a dataset-depend

Reviewer 03 · Rating 5 · Confidence 4

Strengths

The paper is notationally clear and seems solid. The topic is important and of interest to the community. Theories related to the cold posterior effect are discussed in good detail. Additional material in the appendix is illuminating. Source code is published for reproducibility.

Weaknesses

My primary critique is that the empirical study is somewhat limited. While the proposed method appears straightforward and effective, its comparative performance with alternative approaches is not entirely clear. For instance, while the paper demonstrates how the tempered posterior can be reframed as an equivalent 'tempered model,' it does not include a comparison with direct Bayesian inference within this model, treating the temperature as a model parameter. Another common technique involves fi

Code & Models

Repositories

weiyaw/tempered-posteriors (JAX, official)

Taxonomy

Topics: Gaussian Processes and Bayesian Inference

Methods: Collaborative Preference Embedding