On Joint Regularization and Calibration in Deep Ensembles

Laurits Fredsgaard; Mikkel N. Schmidt

arXiv:2511.04160·cs.LG·November 10, 2025

TL;DR

This paper explores how joint tuning of hyperparameters in deep ensembles can enhance performance and calibration, proposing a practical holdout strategy to balance data use and evaluation.

Contribution

The paper introduces a partially overlapping holdout strategy and demonstrates the benefits of jointly tuning weight decay, temperature scaling, and early stopping in deep ensembles.
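The summary does not spell out how the partially overlapping holdout is built, so the following is only a minimal sketch of one possible construction, not the authors' procedure. It assumes every ensemble member withholds a common shared block (usable for joint evaluation of the whole ensemble) plus its own private block (which the other members still train on). The function name `overlapping_holdouts` and the parameters `shared_size` and `private_size` are hypothetical.

```python
import random


def overlapping_holdouts(n_examples, n_members, shared_size, private_size, seed=0):
    """Return one holdout index set per ensemble member.

    ASSUMPTION (not from the paper): each holdout is a shared block that
    all members withhold, plus a disjoint private block per member.
    """
    needed = shared_size + n_members * private_size
    if needed > n_examples:
        raise ValueError("not enough examples for the requested holdout sizes")

    rng = random.Random(seed)
    indices = list(range(n_examples))
    rng.shuffle(indices)

    shared = set(indices[:shared_size])   # withheld by every member
    rest = indices[shared_size:]          # source of the private blocks

    holdouts = []
    for m in range(n_members):
        private = rest[m * private_size:(m + 1) * private_size]
        holdouts.append(shared | set(private))
    return holdouts


def training_indices(n_examples, holdout):
    """Indices a member may train on: everything outside its own holdout."""
    return [i for i in range(n_examples) if i not in holdout]
```

Under this assumed layout, no member has seen the shared block, so the combined ensemble prediction can be scored there, while each private block costs only one member its training data. Growing `shared_size` gives a larger joint evaluation set at the price of less training data for every member.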

Findings

01. Joint tuning often improves performance and calibration.

02. Overlapping holdout balances evaluation and training data.

03. Effectiveness varies across tasks and metrics.

Abstract

Deep ensembles are a powerful tool in machine learning, improving both model performance and uncertainty calibration. While ensembles are typically formed by training and tuning models individually, evidence suggests that jointly tuning the ensemble can lead to better performance. This paper investigates the impact of jointly tuning weight decay, temperature scaling, and early stopping on both predictive performance and uncertainty quantification. Additionally, we propose a partially overlapping holdout strategy as a practical compromise between enabling joint evaluation and maximizing the use of data for training. Our results demonstrate that jointly tuning the ensemble generally matches or improves performance, with significant variation in effect size across different tasks and metrics. We highlight the trade-offs between individual and joint optimization in deep ensemble training,…
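To make the distinction between individual and joint tuning concrete, here is a minimal sketch for the temperature alone. It assumes a single shared temperature applied to every member's logits before the softmax outputs are averaged, and a plain grid search over held-out negative log-likelihood; the abstract does not say whether the paper shares one temperature or how it searches, so both choices are assumptions. All function names are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over one logit vector."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_nll(member_logits, labels, temperature):
    """Mean NLL of the averaged ensemble prediction.

    member_logits[m][i] is the logit vector of member m for example i.
    ASSUMPTION: all members share the same temperature.
    """
    n_members = len(member_logits)
    total = 0.0
    for i, y in enumerate(labels):
        p_y = sum(softmax(member_logits[m][i], temperature)[y]
                  for m in range(n_members)) / n_members
        total -= math.log(max(p_y, 1e-12))
    return total / len(labels)


def tune_jointly(member_logits, labels, grid):
    """Pick the temperature that minimizes the NLL of the whole ensemble."""
    return min(grid, key=lambda t: ensemble_nll(member_logits, labels, t))


def tune_individually(member_logits, labels, grid):
    """Pick one temperature per member, each scored as a one-model ensemble."""
    return [tune_jointly([logits], labels, grid) for logits in member_logits]
```

The two procedures can disagree: averaging already softens the ensemble's predictions, so a temperature that is best for each member in isolation is not necessarily best once the members are combined. The same joint-versus-individual comparison applies to weight decay and the early-stopping epoch, although those require retraining or checkpointing rather than a post-hoc search.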

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Machine Learning and Data Classification · Advanced Neural Network Applications