Provable Uncertainty Decomposition via Higher-Order Calibration
Gustaf Ahdritz, Aravind Gollakota, Parikshit Gopalan, Charlotte Peale, Udi Wieder

TL;DR
This paper introduces a formal method for decomposing predictive uncertainty into aleatoric and epistemic parts using higher-order calibration, with guarantees that require no assumptions on the data distribution.
Contribution
It proposes higher-order calibration as a new framework for uncertainty decomposition with formal guarantees and applicability to existing models.
Findings
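Under higher-order calibration, the estimated aleatoric uncertainty is guaranteed to match the real-world aleatoric uncertainty, averaged over all points where the same prediction is made.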
Applicable to Bayesian and ensemble models.
Effective in image classification experiments.
Abstract
We give a principled method for decomposing the predictive uncertainty of a model into aleatoric and epistemic components with explicit semantics relating them to the real-world data distribution. While many works in the literature have proposed such decompositions, they lack the type of formal guarantees we provide. Our method is based on the new notion of higher-order calibration, which generalizes ordinary calibration to the setting of higher-order predictors that predict mixtures over label distributions at every point. We show how to measure as well as achieve higher-order calibration using access to k-snapshots, namely examples where each point has k independent conditional labels. Under higher-order calibration, the estimated aleatoric uncertainty at a point is guaranteed to match the real-world aleatoric uncertainty averaged over all points where the prediction is made. To…
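To make the abstract's terms concrete, here is a minimal sketch of how a higher-order prediction (a mixture over label distributions) can be split into aleatoric and epistemic parts. It assumes the common entropy-based decomposition (total = entropy of the mean prediction, aleatoric = expected entropy of the components, epistemic = their difference); the summary above does not state which uncertainty measure the paper uses, so this choice is an assumption. The names `entropy`, `decompose`, and the two example mixtures are hypothetical and chosen only for illustration.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete label distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)


def decompose(mixture):
    """Split predictive uncertainty of a mixture over label distributions.

    mixture: list of (weight, label_distribution) pairs, weights summing to 1.
    Returns (total, aleatoric, epistemic) under the assumed entropy-based
    decomposition: total = H(mean), aleatoric = E[H(component)].
    """
    n_labels = len(mixture[0][1])
    mean = [sum(w * p[i] for w, p in mixture) for i in range(n_labels)]
    total = entropy(mean)
    aleatoric = sum(w * entropy(p) for w, p in mixture)
    return total, aleatoric, total - aleatoric


# Hypothetical higher-order predictions for a binary label.
# Components are individually certain but disagree: uncertainty is epistemic.
confident_disagreement = [(0.5, [1.0, 0.0]), (0.5, [0.0, 1.0])]
# Components agree that the label is a fair coin: uncertainty is aleatoric.
pure_noise = [(0.5, [0.5, 0.5]), (0.5, [0.5, 0.5])]

for name, mix in [("disagreement", confident_disagreement), ("noise", pure_noise)]:
    total, aleatoric, epistemic = decompose(mix)
    print(f"{name}: total={total:.3f} aleatoric={aleatoric:.3f} epistemic={epistemic:.3f}")
```

Both mixtures have the same mean prediction and therefore the same total uncertainty; only the mixture structure distinguishes the two cases, which is why a first-order predictor alone cannot separate them.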
Peer Reviews
Decision: ICLR 2025 Spotlight
The approach is innovative, and the authors look at the calibration of second-order distributions, which is still a little-studied subfield. The exposition is mostly clear, and the results they derive are mathematically sound.
See Questions.
The main strengths of the paper are its theoretical contributions. Specifically, it generalizes the notion of calibration by extending it to "higher-order" calibration. The authors show that a higher-order calibrated predictor provides correct estimates of the true AU. Moreover, they show that being higher-order calibrated is the necessary and sufficient condition for producing accurate estimates of AU, which I believe is an important theoretical result. Since in practice we can only have fi
I see several weaknesses in the paper, which I list below. Additionally, there are aspects I did not fully understand (which may not necessarily be weaknesses), and I will list them in the Questions section.
### Structure and Text:
I find the structure of the paper somewhat confusing and believe it could be improved. Specifically:
- Figure 1 on page 2 is never referenced in the text. The authors might consider referencing it in the paragraph on lines 37-43, where they essentially describe it
1. Presents important and interesting theoretical formalization on higher-order calibration and convergence of approximate k-order calibration to approximate higher-order calibration, preserving the mathematical rigor.
2. The approximate k-snapshot calibration is certainly important in applying the theory in practice.
3. The paper is well written with a comprehensive study on related works which can be very useful for a novice reader.
4. The experiments are well motivated and demonstrate the the
1. In the experiments, it would be useful to mention how the quantities of interest were estimated, for example how the entropy values were computed empirically, to give an idea of how their estimation errors might affect the overall uncertainty estimation.
2. The running example could be discussed in detail in the main text or in the appendix as an aid for understanding the notation and the definitions.
Taxonomy
Topics: Fault Detection and Control Systems · Probabilistic and Robust Engineering Design · Structural Health Monitoring Techniques
