On the good reliability of an interval-based metric to validate prediction uncertainty for machine learning regression tasks

Pascal Pernot

arXiv:2408.13089 · stat.ML · August 27, 2024

TL;DR

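This paper proposes shifting from variance-based calibration metrics to an existing interval-based metric, the Prediction Interval Coverage Probability (PICP), to validate the average calibration of prediction uncertainty in machine learning regression. Variance-based metrics are sensitive to heavy tails in the error and uncertainty distributions, whereas PICP can be tested more quickly and more reliably.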

Contribution

It proposes PICP as a more robust alternative to variance-based calibration metrics (ZMS, NLL, RCE) for validating the average calibration of prediction uncertainty.

Findings

01

Student's-t distributions represent sets of z-scores well

02

The simple 2σ rule accurately estimates 95% prediction intervals for ν > 3

03

PICP testing can be applied to 20% more datasets than ZMS testing
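The 2σ finding can be checked numerically. The sketch below is not taken from the paper: it assumes that z-scores follow a Student's-t distribution with ν degrees of freedom rescaled to unit variance (which requires ν > 2), and it integrates the density with Simpson's rule to obtain the probability that |z| ≤ 2. The function names `student_t_pdf` and `two_sigma_coverage` are illustrative only.

```python
import math


def student_t_pdf(x, nu):
    """Density of the standard Student's-t distribution with nu degrees of freedom."""
    log_norm = (
        math.lgamma((nu + 1) / 2)
        - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi)
    )
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(x * x / nu))


def two_sigma_coverage(nu, k=2.0, steps=4000):
    """P(|z| <= k) for a Student's-t variable rescaled to unit variance.

    Assumption (not from the paper): z = t * sqrt((nu - 2) / nu), so nu must exceed 2.
    `steps` must be even for the composite Simpson rule.
    """
    if nu <= 2:
        raise ValueError("the variance is only finite for nu > 2")
    # k standard deviations of z, expressed in units of the raw t variable.
    bound = k * math.sqrt(nu / (nu - 2))
    h = bound / steps
    total = student_t_pdf(0.0, nu) + student_t_pdf(bound, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_t_pdf(i * h, nu)
    # The density is symmetric, so integrate over [0, bound] and double.
    return 2 * total * h / 3


for nu in (3, 4, 6, 10, 30):
    print(f"nu = {nu:>2}: coverage of the 2-sigma interval = {two_sigma_coverage(nu):.4f}")
```

Under this assumption the coverage stays within a few tenths of a percent of 0.95 once ν exceeds 3, and drifts upward as ν approaches 2, which is consistent with the ν > 3 condition stated in the finding.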

Abstract

This short study presents an opportunistic approach to a (more) reliable validation method for prediction uncertainty average calibration. Considering that variance-based calibration metrics (ZMS, NLL, RCE...) are quite sensitive to the presence of heavy tails in the uncertainty and error distributions, a shift is proposed to an interval-based metric, the Prediction Interval Coverage Probability (PICP). It is shown on a large ensemble of molecular properties datasets that (1) sets of z-scores are well represented by Student's-$t(\nu)$ distributions, $\nu$ being the number of degrees of freedom; (2) accurate estimation of 95% prediction intervals can be obtained by the simple $2\sigma$ rule for $\nu > 3$; and (3) the resulting PICPs are more quickly and reliably tested than variance-based calibration metrics. Overall, this method makes it possible to test 20% more datasets than ZMS testing.…
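As an illustration of the two kinds of metric contrasted in the abstract, the following minimal sketch computes a PICP with the 2σ rule and a ZMS (mean squared z-score) on synthetic data. It is not the author's implementation: the function names, the synthetic normal errors, and the use of a Wilson score interval to test the PICP against the 0.95 target are all assumptions made for the example.

```python
import math
import random


def picp(errors, uncertainties, k=2.0):
    """Fraction of errors that fall inside the interval [-k*u, +k*u]."""
    inside = sum(1 for e, u in zip(errors, uncertainties) if abs(e) <= k * u)
    return inside / len(errors)


def zms(errors, uncertainties):
    """Mean of squared z-scores; the target is 1 for average calibration."""
    return sum((e / u) ** 2 for e, u in zip(errors, uncertainties)) / len(errors)


def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    Assumed here as one common way to test a coverage; the paper may use another.
    """
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Synthetic example: each error is drawn with the standard deviation it is
# paired with, so the uncertainties are calibrated by construction.
rng = random.Random(0)
n = 5000
uncertainties = [rng.uniform(0.5, 2.0) for _ in range(n)]
errors = [rng.gauss(0.0, u) for u in uncertainties]

coverage = picp(errors, uncertainties)
low, high = wilson_interval(coverage, n)
print(f"PICP = {coverage:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
print("target 0.95 inside CI:", low <= 0.95 <= high)
print(f"ZMS  = {zms(errors, uncertainties):.3f} (target 1)")
```

The PICP is a proportion, so its sampling uncertainty follows from binomial statistics regardless of how heavy the tails of the errors are. The ZMS is an average of squared values, so a few extreme z-scores can dominate it, which is the sensitivity the abstract refers to.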

Code & Models

Repositories

ppernot/2024_picp
Official

Taxonomy

Topics: Neural Networks and Applications · Fault Detection and Control Systems