Scalable Bayesian Monte Carlo: fast uncertainty estimation beyond deep ensembles
Xinzhu Liang, Joseph M. Lukens, Sanjaya Lohani, Brian T. Kirby, Thomas A. Searles, Xin Qiu, Kody J. H. Law

TL;DR
This paper presents SBMC, a scalable Bayesian Monte Carlo method that offers fast, accurate uncertainty estimation in deep learning, outperforming state-of-the-art methods like deep ensembles in reliability and epistemic uncertainty quantification.
Contribution
Introduces SBMC, a scalable Bayesian Monte Carlo approach combining a new model and parallel algorithms for improved uncertainty estimation in deep learning.
Findings
SBMC achieves comparable or better accuracy than deep ensembles.
SBMC provides substantially improved epistemic uncertainty quantification.
SBMC enables reliable confidence estimation for predictions.
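As a rough illustration of what "epistemic uncertainty" refers to in the findings above, the sketch below applies the standard entropy decomposition to the class probabilities produced by several ensemble members or posterior samples. Whether the paper uses this exact metric is an assumption; the function names `entropy` and `decompose_uncertainty` are hypothetical and chosen only for this example.

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy (in nats) of categorical distributions along `axis`."""
    return -np.sum(p * np.log(p + eps), axis=axis)

def decompose_uncertainty(member_probs):
    """Split predictive uncertainty into aleatoric and epistemic parts.

    member_probs has shape (n_members, n_inputs, n_classes): one row of
    class probabilities per ensemble member (or posterior sample) per input.
    """
    mean_probs = member_probs.mean(axis=0)          # averaged predictive distribution
    total = entropy(mean_probs)                     # entropy of the average
    aleatoric = entropy(member_probs).mean(axis=0)  # average entropy of each member
    epistemic = total - aleatoric                   # disagreement between members
    return total, aleatoric, epistemic
```

Under this decomposition, the epistemic term is near zero when all members predict the same distribution and grows when confident members disagree with one another.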
Abstract
This work introduces a new method designed for Bayesian deep learning called scalable Bayesian Monte Carlo (SBMC). The method consists of a model and an algorithm. The model interpolates between a point estimator and the posterior. The algorithm is a parallel implementation of a sequential Monte Carlo (SMC) sampler or Markov chain Monte Carlo (MCMC). We collectively refer to these consistent (asymptotically unbiased) algorithms as Bayesian Monte Carlo (BMC), and any such algorithm can be used in our SBMC method. The utility of the method is demonstrated on practical examples: MNIST, CIFAR, IMDb. A systematic numerical study reveals that for the same wall-clock time as state-of-the-art (SOTA) methods like deep ensembles (DE), SBMC achieves comparable or better accuracy and substantially improved uncertainty quantification (UQ), in particular epistemic UQ. This…
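The abstract describes a model that interpolates between a point estimator and the posterior, sampled by parallel Monte Carlo chains. The toy sketch below shows one way such an interpolation could look on a one-dimensional Gaussian problem. It is an assumption made for illustration, not the paper's definition: the prior mean is moved toward the MAP estimate and the prior variance is scaled by `s`, so that `s = 1` recovers the original posterior and `s` near 0 collapses onto the MAP point. All names (`anchored_log_target`, `parallel_rwm`, `run_demo`) are hypothetical, and the "parallel" chains are simply vectorized in NumPy.

```python
import numpy as np

def anchored_log_target(theta, y, theta_map, mu0, tau0, s):
    """Unnormalized log density of a hypothetical anchored posterior.

    Assumed form: prior N(m_s, v_s) with m_s = (1 - s) * theta_map + s * mu0
    and v_s = s * tau0**2, combined with a unit-variance Gaussian likelihood.
    """
    m_s = (1.0 - s) * theta_map + s * mu0
    v_s = s * tau0**2
    log_lik = -0.5 * np.sum((y[None, :] - theta[:, None]) ** 2, axis=1)
    log_prior = -0.5 * (theta - m_s) ** 2 / v_s
    return log_lik + log_prior

def parallel_rwm(log_target, init, n_steps, step, rng):
    """Run independent random-walk Metropolis chains, one per entry of init."""
    theta = init.copy()
    logp = log_target(theta)
    samples = np.empty((n_steps, theta.size))
    for t in range(n_steps):
        prop = theta + step * rng.standard_normal(theta.size)
        logp_prop = log_target(prop)
        # Accept each chain's proposal with probability min(1, ratio).
        accept = np.log(rng.uniform(size=theta.size)) < logp_prop - logp
        theta = np.where(accept, prop, theta)
        logp = np.where(accept, logp_prop, logp)
        samples[t] = theta
    return samples

def run_demo(s, n_chains=8, n_steps=2000, seed=0):
    """Sample the toy anchored posterior; return (MAP, sample mean, sample std)."""
    rng = np.random.default_rng(seed)
    y = rng.normal(1.5, 1.0, size=20)       # synthetic data, unit noise
    mu0, tau0 = 0.0, 2.0                    # original Gaussian prior
    prec = 1.0 / tau0**2 + y.size
    theta_map = (mu0 / tau0**2 + y.sum()) / prec   # closed-form MAP
    target = lambda th: anchored_log_target(th, y, theta_map, mu0, tau0, s)
    init = np.full(n_chains, theta_map)     # every chain starts at the anchor
    samples = parallel_rwm(target, init, n_steps, step=0.3, rng=rng)
    pooled = samples[n_steps // 2:].ravel() # drop first half as burn-in, pool chains
    return theta_map, pooled.mean(), pooled.std()
```

Calling `run_demo(s=1.0)` and `run_demo(s=1e-3)` gives a feel for the trade-off in this toy setting: both runs center on the MAP estimate, while the smaller `s` yields a much narrower spread and therefore an easier sampling problem at the cost of understating posterior uncertainty.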
Peer Reviews
Decision · Submitted to ICLR 2026
The paper is well written, and the empirical results are presented clearly. Below are the strengths of the presented method: 1. SBMC introduces a model approximation, the anchored posterior, that uses a scalar interpolation parameter to tune the trade-off between the fast Maximum A Posteriori estimator and the full Bayesian posterior. 2. The method is highly scalable due to the parallel implementation of consistent Bayesian Monte Carlo algorithms and achieves near-linear speed-up. 3. Empir
I think the paper can be improved by addressing the following weak points: 1. The paper relies on empirical tuning for the scalar interpolation parameter $s$ and provides no theoretical analysis linking $s$ to the difference between the anchored posterior and the true posterior. 2. SBMC requires computing a good Maximum A Posteriori estimate to act as the anchor before parallel Bayesian Monte Carlo sampling, and the MNIST7 experiment shows that the total cost is roughly doubled compared to the deep ensembles method due t
1. **Practical Motivation and Design:** The paper addresses a clear problem: scaling Bayesian inference to deep nets to obtain well-calibrated uncertainty. The SBMC model is intuitively motivated as interpolating between a cheap point estimate and the full posterior. This interpolation idea is simple and flexible: by tuning $s$ one can trade off bias vs. sampling difficulty. The algorithmic idea of running multiple short MCMC/SMC chains in parallel is straightforward and leverages modern paralle
1. **Limited Novelty / Relation to Prior Methods:** The core idea (anchoring the posterior at a point estimate and running multiple short MCMC chains) is conceptually similar to known techniques. The paper cites Randomized Maximum Likelihood (RML) and ensemble anchoring methods (e.g. Gu & Oliver 2007; Bardsley et al. 2014). While SBMC’s scalar interpolation $s$ is a convenient formalism, the idea of tempering/anchoring (even mentioning “cold posteriors”) is not fundamentally new. Indeed, Paulin
* **Simple concept of prior anchoring:** The core and simple modeling idea of interpolating between a point estimate and a posterior approximation is conceptually interesting. The fact that this framework is sampler-agnostic is a definite plus, allowing for flexibility in its implementation. I also liked the differentiation from the tempering-induced interpolation in the literature concerned with Cold Posteriors.
* **Extensive Empirical Evaluation:** A significant amount of work has clearly gone
* **Concerns Regarding Novelty:** The paper's primary weakness is its limited novelty, particularly on the algorithmic side. As far as I understand it, the proposed algorithm is essentially a parallel MCMC or SMC sampler initialized near a MAP estimate and with a slightly altered prior. This is not a new concept; many old (e.g. [1]) and also recent works, including some cited by the authors in Section 4 (e.g., SMS-UBU, Bayesian Deep Ensembles), have explored similar ideas. The paper fails to provi
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Statistical Methods and Inference · Target Tracking and Data Fusion in Sensor Networks
Methods: Deep Ensembles
