A Novel Framework for Uncertainty Quantification via Proper Scores for Classification and Beyond
Sebastian G. Gruber

TL;DR
This thesis introduces a comprehensive framework for uncertainty quantification in machine learning based on proper scores, applicable across various tasks, with theoretical insights and practical evaluations including large language models and generative models.
Contribution
It develops a general bias-variance decomposition for proper scores, introduces new calibration error estimators, and applies kernel scores for evaluating generative models across domains.
Findings
The kernel score effectively evaluates generative models in multiple domains.
The proposed uncertainty estimation method outperforms existing baselines for large language models.
The new calibration error estimators provide more accurate and interpretable calibration assessments.
Abstract
In this PhD thesis, we propose a novel framework for uncertainty quantification in machine learning, which is based on proper scores. Uncertainty quantification is an important cornerstone for trustworthy and reliable machine learning applications in practice. Usually, approaches to uncertainty quantification are problem-specific, and solutions and insights cannot be readily transferred from one task to another. Proper scores are loss functions that are minimized, in expectation, by predicting the target distribution. Because their definition is very general, proper scores apply to regression, classification, and even generative modeling tasks. We contribute several theoretical results that connect epistemic uncertainty, aleatoric uncertainty, and model calibration with proper scores, resulting in a general and widely applicable framework. We achieve this by introducing a general bias-variance decomposition for…
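The defining property of a proper score stated in the abstract (the expected loss is minimized by predicting the target distribution) can be checked numerically. The following is a minimal sketch, not code from the thesis: the Brier score and the log score are standard textbook examples of proper scores for classification, and the function names `expected_brier` and `expected_log_score`, the target distribution `p_true`, and the candidate predictions are all assumptions chosen for illustration.

```python
import numpy as np


def expected_brier(q, p):
    """Expected Brier score of prediction q when the label is drawn from p."""
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    one_hot = np.eye(len(p))
    # Brier score for each possible label y: squared distance between q and e_y.
    per_label = ((q[None, :] - one_hot) ** 2).sum(axis=1)
    # Average over labels, weighted by their true probabilities.
    return float(p @ per_label)


def expected_log_score(q, p):
    """Expected log score (cross-entropy) of prediction q under target p."""
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(-(p * np.log(q)).sum())


# Hypothetical target distribution over three classes (illustrative only).
p_true = np.array([0.7, 0.2, 0.1])

# Hypothetical candidate predictions, including the target itself.
candidates = {
    "target": p_true,
    "uniform": np.array([1 / 3, 1 / 3, 1 / 3]),
    "overconfident": np.array([0.98, 0.01, 0.01]),
    "wrong mode": np.array([0.1, 0.8, 0.1]),
}

for name, q in candidates.items():
    print(
        f"{name:>14}: "
        f"Brier={expected_brier(q, p_true):.4f}  "
        f"log={expected_log_score(q, p_true):.4f}"
    )
```

Under both scores, the row for `target` should show the smallest expected loss, which is what "minimized by predicting the target distribution" means in practice. The same check does not carry over to arbitrary losses: an improper loss can reward predictions that differ from the target distribution.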
