Scaling Laws for Uncertainty in Deep Learning

Mattia Rosso; Simone Rossi; Giulio Franzese; Markus Heinonen; Maurizio Filippone

arXiv:2506.09648·stat.ML·February 10, 2026

Open Access · 3 Reviews

TL;DR

This paper empirically demonstrates that predictive uncertainties in deep learning models follow scaling laws relative to dataset and model sizes, highlighting the importance of Bayesian methods even with large data.

Contribution

It provides the first empirical evidence of scaling laws governing predictive uncertainty in over-parameterized deep learning models across vision and language tasks.

Findings

1. Scaling laws exist for predictive uncertainty measures.
2. Uncertainty does not diminish sufficiently with large data alone.
3. Bayesian approaches remain valuable despite large datasets.

Abstract

Deep learning has recently revealed the existence of scaling laws, demonstrating that model performance follows predictable trends based on dataset and model sizes. Inspired by these findings and fascinating phenomena emerging in the over-parameterized regime, we examine a parallel direction: do similar scaling laws govern predictive uncertainties in deep learning? In identifiable parametric models, such scaling laws can be derived in a straightforward manner by treating model parameters in a Bayesian way. In this case, for example, we obtain $O(1/N)$ contraction rates for epistemic uncertainty with respect to the number of data points $N$. However, in over-parameterized models, these guarantees do not hold, leading to largely unexplored behaviors. In this work, we empirically show the existence of scaling laws associated with various measures of predictive uncertainty with respect to dataset…
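The $O(1/N)$ contraction mentioned in the abstract can be illustrated in the simplest identifiable setting. The following is a minimal sketch, assuming Bayesian linear regression with a Gaussian prior and known noise variance; it is not the paper's experimental setup. The names `epistemic_variance`, `noise_var`, and `prior_var`, and all numeric values, are illustrative assumptions. The sketch computes the average epistemic predictive variance for growing training sets and fits a power-law exponent on log-log axes.

```python
import numpy as np

rng = np.random.default_rng(0)
d, noise_var, prior_var = 5, 0.25, 1.0   # illustrative values, not from the paper
Ns = np.array([100, 200, 400, 800, 1600, 3200])

# Nested training sets: the first n rows of X_all are used for size n.
X_all = rng.normal(size=(Ns.max(), d))
X_test = rng.normal(size=(200, d))       # fixed held-out inputs


def epistemic_variance(X, X_test):
    """Mean of x*^T Sigma x* over test inputs, where Sigma is the
    posterior covariance of the weights under a N(0, prior_var I) prior."""
    precision = X.T @ X / noise_var + np.eye(X.shape[1]) / prior_var
    cov = np.linalg.inv(precision)
    return float(np.mean(np.einsum("ij,jk,ik->i", X_test, cov, X_test)))


eu = np.array([epistemic_variance(X_all[:n], X_test) for n in Ns])

# Fit log(eu) = intercept - gamma * log(N); gamma is the scaling exponent.
slope, intercept = np.polyfit(np.log(Ns), np.log(eu), 1)
gamma = -slope
print(f"fitted exponent gamma = {gamma:.3f}")
```

In this toy model the fitted exponent should land close to 1, matching the $O(1/N)$ rate. The same log-log fit is one plausible way to estimate an exponent from uncertainty measurements on over-parameterized models, where no such guarantee applies.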

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 2 · Confidence 3

Strengths

- The paper is well written and organized.
- The paper deals with uncertainty quantification in deep learning, which is important for safety-critical applications of deep learning models.

Weaknesses

- The paper studies uncertainty scaling, but even with the same models and data, the uncertainties quantified using different approaches end up being different and display different characteristics. This makes me wonder how generalizable the results would be.
- The paper provides an empirical study, so its contribution offers limited advancement over existing studies. The experimental settings considered are also limited and do not capture common model and data sizes.

Reviewer 02 · Rating 2 · Confidence 4

Strengths

1. The paper addresses an important research direction that has been largely overlooked in recent studies.
2. The work presents a comprehensive evaluation across both visual and textual domains, focusing on multiple model sizes, architectures, and training configurations for in-domain and out-of-domain visual tasks.
3. The authors provide a clear and detailed description of the training configurations, ensuring strong reproducibility of the experiments.

Weaknesses

1. The structure of the paper is weak and requires significant improvement. For example, Figure 7 is cited on page 5 but appears on page 7, while Figure 4 appears on page 5 but is cited on page 7, etc. These inconsistencies disrupt the reading flow and make it difficult to follow the paper’s narrative, forcing the reader to constantly scroll back and forth to connect the references with the corresponding figures. Moreover, in some figures, the model is fixed and the types of uncertainty are dist

Reviewer 03 · Rating 4 · Confidence 3

Strengths

1. The premise sounds interesting: the authors claim they provide "strong evidence to dispel recurring skepticism against Bayesian approaches".
2. The authors empirically observe a power law in uncertainty estimation.
3. The authors conducted extensive experiments.

Weaknesses

1. I don't like "strong evidence to dispel": the authors draw their claims from empirical observations, and I think that's far from enough to claim "strong evidence".
2. Eq. (9) seems incorrect.
3. **The so-called “theoretical connection” holds only for identifiable linear models and offers no quantitative derivation for deep networks.**
4. EU doesn’t decrease (near-zero or even positive slope): in multiple plots, the fitted exponent for epistemic uncertainty is effectively zero or positiv

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Explainable Artificial Intelligence (XAI)