Theoretical characterization of uncertainty in high-dimensional linear classification

Lucas Clarté; Bruno Loureiro; Florent Krzakala; Lenka Zdeborová

arXiv:2202.03295·cs.LG·September 12, 2023·1 cites

Open Access · 1 Repo

TL;DR

This paper provides a theoretical analysis of uncertainty in high-dimensional linear classification, deriving formulas for Bayesian uncertainty and classifier calibration, especially in limited data scenarios, using approximate message passing.

Contribution

It introduces a closed-form formula linking Bayesian uncertainty, classifier predictions, and ground-truth uncertainty in high-dimensional Gaussian data, advancing understanding of model calibration.

Findings

01

Bayesian uncertainty can be approximated via AMP in high dimensions.

02

The derived formulas enable analysis of classifier calibration and over-confidence.

03

Regularization can mitigate over-confidence in limited data settings.
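
For readers unfamiliar with the terms used in the findings above, the following block sketches one common way to write down the probit data model and the calibration of a classifier. The notation ($w_\star$, $\sigma$, $\hat f$, $\Delta_p$) and the $1/\sqrt{d}$ scaling are assumptions of this sketch and are not taken from this page; the paper's own definitions take precedence.

```latex
% Probit teacher: Gaussian input x in R^d, label noise z ~ N(0,1)
y = \operatorname{sign}\!\left( \frac{w_\star^\top x}{\sqrt{d}} + \sigma z \right),
\qquad
f_\star(x) = \mathbb{P}(y = 1 \mid x)
           = \Phi\!\left( \frac{w_\star^\top x}{\sigma \sqrt{d}} \right),

% Calibration of a classifier with predicted probability \hat f(x), at level p
\Delta_p = p - \mathbb{E}_x\!\left[ f_\star(x) \,\middle|\, \hat f(x) = p \right].
```

Here $\Phi$ is the standard normal cumulative distribution function. Under this convention, $\Delta_p > 0$ for $p > 1/2$ indicates that the classifier is over-confident at level $p$, and $\Delta_p = 0$ for all $p$ means it is calibrated.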

Abstract

Being able to reliably assess not only the accuracy but also the uncertainty of models' predictions is an important endeavour in modern machine learning. Even if the model generating the data and labels is known, computing the intrinsic uncertainty after learning the model from a limited number of samples amounts to sampling the corresponding posterior probability measure. Such sampling is computationally challenging in high-dimensional problems, and theoretical results on heuristic uncertainty estimators in high dimensions are thus scarce. In this manuscript, we characterise uncertainty for learning from a limited number of samples of high-dimensional Gaussian input data and labels generated by the probit model. In this setting, the Bayesian uncertainty (i.e. the posterior marginals) can be asymptotically obtained by the approximate message passing algorithm, bypassing the…
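
The data model described in the abstract (Gaussian inputs, probit labels) can be illustrated with a minimal sketch in pure Python. It only generates synthetic data and checks that the ground-truth label probability matches observed label frequencies; it does not implement approximate message passing or any estimator from the paper. The function names, the noise parameter `sigma`, and the `1/sqrt(d)` scaling are assumptions made for this illustration.

```python
import math
import random


def probit(t):
    """Standard normal CDF Phi(t), computed with the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def sample_probit_data(n, d, sigma, rng):
    """Draw a Gaussian teacher vector and n Gaussian inputs with probit labels.

    Labels follow y = sign(w_star . x / sqrt(d) + sigma * z) with z ~ N(0, 1),
    so the ground-truth probability is
    P(y = +1 | x) = Phi(w_star . x / (sqrt(d) * sigma)).
    (Scaling and names are assumptions of this sketch.)
    """
    w_star = [rng.gauss(0.0, 1.0) for _ in range(d)]
    xs, ys, truths = [], [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        field = sum(wi * xi for wi, xi in zip(w_star, x)) / math.sqrt(d)
        y = 1 if field + sigma * rng.gauss(0.0, 1.0) > 0 else -1
        xs.append(x)
        ys.append(y)
        truths.append(probit(field / sigma))
    return w_star, xs, ys, truths


def frequency_in_bin(ys, truths, low, high):
    """Compare mean ground-truth probability with the observed frequency of
    y = +1 among samples whose ground-truth probability lies in [low, high)."""
    selected = [(y, p) for y, p in zip(ys, truths) if low <= p < high]
    if not selected:
        return None
    mean_truth = sum(p for _, p in selected) / len(selected)
    freq_pos = sum(1 for y, _ in selected if y == 1) / len(selected)
    return mean_truth, freq_pos, len(selected)


rng = random.Random(0)
w_star, xs, ys, truths = sample_probit_data(n=2000, d=20, sigma=0.5, rng=rng)
print(frequency_in_bin(ys, truths, 0.8, 1.01))
```

The two numbers returned by `frequency_in_bin` should be close for a large enough bin, which is what it means for the ground-truth probability to be calibrated. Replacing `truths` with the predicted probabilities of a fitted classifier would turn the same comparison into an empirical calibration check.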

Code & Models

Repositories

lclarte/uncertainty-project (Official)

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Machine Learning and Algorithms · Machine Learning and Data Classification