Humble your Overconfident Networks: Unlearning Overfitting via Sequential Monte Carlo Tempered Deep Ensembles

Andrew Millard; Zheng Zhao; Joshua Murphy; Simon Maskell

arXiv:2505.11671·stat.ML·May 20, 2025

Open Access

TL;DR

This paper presents a scalable SMC method with SGHMC proposals for Bayesian neural networks. It improves uncertainty quantification and calibration and reduces overfitting across image classification, out-of-distribution (OOD) detection, and transfer learning tasks.

Contribution

Introduces SMCSGHMC, a scalable Bayesian sampling method combining SMC and SGHMC for neural networks, enabling efficient mini-batch sampling and better uncertainty estimates.

Findings

01 SMCSGHMC outperforms SGD and deep ensembles in image classification.

02 It reduces overfitting and enhances calibration of neural networks.

03 The method is effective in OOD detection and transfer learning tasks.
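To make the calibration claim concrete, here is a minimal sketch of expected calibration error (ECE), one common way to measure how far predicted confidence departs from observed accuracy. This summary does not say which calibration metric the paper reports, so ECE is an assumption here, and the function name, the bin count, and the equal-width binning are illustrative choices.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """ECE with equal-width confidence bins (illustrative, not from the paper).

    probs:  (n, n_classes) array of predicted class probabilities.
    labels: (n,) array of integer class labels.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    conf = probs.max(axis=1)                 # confidence of the predicted class
    pred = probs.argmax(axis=1)              # predicted class
    correct = (pred == labels).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            # Gap between accuracy and mean confidence, weighted by bin mass.
            gap = abs(correct[in_bin].mean() - conf[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece
```

A lower value means confidence tracks accuracy more closely. For an ensemble or a set of weighted particles, `probs` would be the averaged predictive distribution.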

Abstract

Sequential Monte Carlo (SMC) methods offer a principled approach to Bayesian uncertainty quantification but are traditionally limited by the need for full-batch gradient evaluations. We introduce a scalable variant by incorporating Stochastic Gradient Hamiltonian Monte Carlo (SGHMC) proposals into SMC, enabling efficient mini-batch based sampling. Our resulting SMCSGHMC algorithm outperforms standard stochastic gradient descent (SGD) and deep ensembles across image classification, out-of-distribution (OOD) detection, and transfer learning tasks. We further show that SMCSGHMC mitigates overfitting and improves calibration, providing a flexible, scalable pathway for converting pretrained neural networks into well-calibrated Bayesian models.
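The idea of using SGHMC proposals inside SMC can be sketched on a toy problem. The code below is a generic tempered SMC sampler with SGHMC-style mini-batch moves for Bayesian logistic regression; it is not the authors' implementation. The linear temperature schedule, the full-data likelihood used for reweighting, the multinomial resampling rule, and every function name and hyperparameter are assumptions made for illustration.

```python
import numpy as np

def log_lik(theta, X, y):
    """Full-data Bernoulli log-likelihood, one value per particle."""
    logits = theta @ X.T                                   # (J, n)
    return np.sum(y * logits - np.logaddexp(0.0, logits), axis=1)

def grad_log_target(theta, Xb, yb, beta, scale, prior_var):
    """Mini-batch estimate of grad log[p(theta) * p(D | theta)^beta]."""
    probs = 0.5 * (1.0 + np.tanh(0.5 * (theta @ Xb.T)))    # stable sigmoid
    grad_lik = (yb - probs) @ Xb                           # (J, d)
    return beta * scale * grad_lik - theta / prior_var

def sghmc_move(theta, X, y, beta, rng, n_steps=20, batch=32,
               eps=1e-3, alpha=0.1, prior_var=1.0):
    """SGHMC-style update applied to all particles at temperature beta."""
    n = X.shape[0]
    v = np.zeros_like(theta)
    for _ in range(n_steps):
        idx = rng.choice(n, size=batch, replace=False)
        g = grad_log_target(theta, X[idx], y[idx], beta, n / batch, prior_var)
        noise = rng.normal(size=theta.shape) * np.sqrt(2.0 * alpha * eps)
        v = (1.0 - alpha) * v + eps * g + noise            # friction + injected noise
        theta = theta + v
    return theta

def smc_sghmc(X, y, n_particles=64, n_temps=10, seed=0, prior_var=1.0):
    """Tempered SMC: reweight, resample when ESS is low, move with SGHMC."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=(n_particles, X.shape[1])) * np.sqrt(prior_var)
    logw = np.full(n_particles, -np.log(n_particles))
    betas = np.linspace(0.0, 1.0, n_temps + 1)             # assumed linear schedule
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # Incremental weight: likelihood raised to the temperature increment.
        logw = logw + (b - b_prev) * log_lik(theta, X, y)
        m = logw.max()
        logw = logw - (m + np.log(np.sum(np.exp(logw - m))))
        ess = 1.0 / np.sum(np.exp(2.0 * logw))
        if ess < n_particles / 2:                          # multinomial resampling
            p = np.exp(logw)
            idx = rng.choice(n_particles, size=n_particles, p=p / p.sum())
            theta = theta[idx]
            logw = np.full(n_particles, -np.log(n_particles))
        theta = sghmc_move(theta, X, y, b, rng, prior_var=prior_var)
    return theta, np.exp(logw)
```

Calling `smc_sghmc(X, y)` with a feature matrix and a vector of 0/1 labels returns the particles and their normalized weights. Because SGHMC has no accept/reject step, the moves leave the tempered target only approximately invariant, so the weights in this sketch are approximate as well.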

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Markov Chains and Monte Carlo Methods

Methods: Deep Ensembles