Efficient Credal Prediction through Decalibration
Paul Hofman, Timo Löhr, Maximilian Muschalik, Yusuf Sale, Eyke Hüllermeier

TL;DR
This paper introduces an efficient credal prediction method that uses decalibration to produce probability intervals, enabling uncertainty estimation in complex models like foundation models and multi-modal systems.
Contribution
The authors propose a novel decalibration technique for credal prediction that reduces computational complexity, making it feasible for large and complex models.
Findings
Achieves strong performance in coverage and out-of-distribution detection
Enables credal prediction on models like TabPFN and CLIP
Reduces computational cost compared to traditional credal set methods
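As a rough illustration of how the coverage finding could be checked, the sketch below assumes each credal prediction is a pair of per-class lower and upper probability bounds (a box), and that coverage means the fraction of instances whose reference label distribution falls inside that box. Both the definition and the helper names (`in_box`, `coverage`, `mean_width`) are assumptions made for this example; the paper's exact metrics are not given on this page. The mean interval width is included as one plausible uncertainty score for out-of-distribution detection, not as the authors' choice.

```python
def in_box(dist, lower, upper, tol=1e-9):
    """True if every class probability lies within its [lower, upper] interval."""
    return all(lo - tol <= p <= up + tol for p, lo, up in zip(dist, lower, upper))


def coverage(dists, lowers, uppers):
    """Fraction of instances whose reference distribution is inside the box."""
    hits = sum(in_box(d, lo, up) for d, lo, up in zip(dists, lowers, uppers))
    return hits / len(dists)


def mean_width(lower, upper):
    """Average interval width over classes (hypothetical uncertainty score)."""
    return sum(u - l for l, u in zip(lower, upper)) / len(lower)


# Toy data, invented for illustration only.
dists = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
lowers = [[0.6, 0.1, 0.0], [0.3, 0.0, 0.4]]
uppers = [[0.9, 0.3, 0.2], [0.5, 0.2, 0.7]]

cov = coverage(dists, lowers, uppers)  # the first instance is covered, the second is not
widths = [mean_width(lo, up) for lo, up in zip(lowers, uppers)]
print(cov, widths)
```

Wider boxes trivially raise coverage, so coverage is usually read together with a width or efficiency measure such as the one above.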
Abstract
A reliable representation of uncertainty is essential for the application of modern machine learning methods in safety-critical settings. In this regard, the use of credal sets (i.e., convex sets of probability distributions) has recently been proposed as a suitable approach to representing epistemic uncertainty. However, as with other approaches to epistemic uncertainty, training credal predictors is computationally complex and usually involves (re-)training an ensemble of models. The resulting computational complexity prevents their adoption for complex models such as foundation models and multi-modal systems. To address this problem, we propose an efficient method for credal prediction that is grounded in the notion of relative likelihood and inspired by techniques for the calibration of probabilistic classifiers. For each class label, our method predicts a range of plausible…
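The abstract is cut off above, so the following is only a minimal sketch of the general idea it describes, not the authors' algorithm. It assumes a calibration-style family of models obtained by adding a bias vector to a fixed classifier's logits, keeps every candidate whose likelihood on reference data is at least `alpha` times that of the best candidate found, and reports the per-class minimum and maximum probability over the kept candidates. The random search over biases, the function name `credal_intervals`, and all parameters are assumptions introduced for illustration.

```python
import math
import random


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def log_likelihood(ref_logits, ref_labels, bias):
    """Log-likelihood of the labels under softmax(logits + bias)."""
    ll = 0.0
    for z, y in zip(ref_logits, ref_labels):
        probs = softmax([zi + bi for zi, bi in zip(z, bias)])
        ll += math.log(probs[y])
    return ll


def credal_intervals(ref_logits, ref_labels, query_logits,
                     alpha=0.8, n_candidates=500, scale=0.5, seed=0):
    """Per-class probability intervals from relatively likely bias shifts."""
    rng = random.Random(seed)
    k = len(query_logits)
    # Candidate set: the unperturbed model plus random bias vectors (assumption).
    candidates = [[0.0] * k]
    candidates += [[rng.gauss(0.0, scale) for _ in range(k)]
                   for _ in range(n_candidates)]
    lls = [log_likelihood(ref_logits, ref_labels, b) for b in candidates]
    best = max(lls)  # best sampled candidate stands in for the true maximum
    lower, upper = [1.0] * k, [0.0] * k
    for bias, ll in zip(candidates, lls):
        if math.exp(ll - best) >= alpha:  # relative likelihood threshold
            probs = softmax([z + b for z, b in zip(query_logits, bias)])
            lower = [min(l, p) for l, p in zip(lower, probs)]
            upper = [max(u, p) for u, p in zip(upper, probs)]
    return lower, upper


# Toy reference logits and labels, invented for illustration only.
ref_logits = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0], [1.5, 0.5, 0.0]]
ref_labels = [0, 1, 2, 0]
lower, upper = credal_intervals(ref_logits, ref_labels, [1.0, 0.8, 0.1])
for cls, (lo, up) in enumerate(zip(lower, upper)):
    print(f"class {cls}: [{lo:.3f}, {up:.3f}]")
```

Lowering `alpha` admits more candidates and widens the intervals; `alpha = 1` keeps only the best candidate and collapses each interval to a point. No retraining of the underlying model is involved, only evaluation of its logits on the reference data.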
Peer Reviews
Decision: ICLR 2026 Poster
• Originality: The paper presents a novel post-hoc approach to credal prediction, decalibration, that eliminates the need for retraining or ensemble-based inference, which have been the dominant approaches in credal and epistemic uncertainty estimation. The idea of adjusting logits within a relative-likelihood constraint is both elegant and conceptually original, bridging Bayesian epistemic reasoning with practical optimization.
• Technical Quality: The theoretical exposition is mathematicall
• Limited Theoretical Depth Beyond Convexity: While the convexity and boundedness of the credal sets are clearly demonstrated, the paper lacks deeper theoretical guarantees. For instance, there are no formal proofs of coverage calibration, robustness under data shift, or asymptotic optimality compared to Bayesian posteriors. Suggestion: Strengthen the theoretical contribution by connecting decalibration to known uncertainty frameworks such as PAC-Bayesian bounds, conformal coverage guarantees, o
- The proposed method enables uncertainty estimation and cautious inference in large pretrained models like CLIP and TabPFN without expensive retraining and finetuning
- The proposed method is straightforward to implement yet quite effective (as demonstrated by the empirical results)
- The authors introduce credal spider plots to visualize credal sets represented as box intervals
- While the method does not require retraining, it does require the original training data or an appropriate surrogate to calculate the relative likelihood
- The presentation is unclear in places, e.g., while the credal spider plots are quite informative, a full explanation of what they represent is presented only in the appendix, which makes earlier references to them (e.g., Figure 1) unclear. Section 4 (Empirical results) presents a lot of information without emphasizing key parts like research
The scope of the problem considered, model-agnostic credal predictions, is sizeable and will be of interest to a wide community. Additionally, the post-hoc approach proposed in this paper does not require any retraining, which will encourage its adoption as an added post-training step for quantifying a model's epistemic uncertainty. The proposed approach itself, to the best of my knowledge, is sound and the theoretical results seem reasonable, if unsurprising. The experimental
Credal predictions are particularly useful in data-scarce and safety-critical domains such as healthcare where the lack of data can lead to higher epistemic uncertainty and understanding the plausible range of model predictions can help avoid catastrophic decisions. In this regard, the motivation behind the need for computationally efficient credal predictions is not very compelling. In the same vein, while the authors note such safety-critical domains in their introduction, the experimental eva
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Bayesian Modeling and Causal Inference · Explainable Artificial Intelligence (XAI)
