On the Pitfalls of Heteroscedastic Uncertainty Estimation with Probabilistic Neural Networks
Maximilian Seitzer, Arash Tavakoli, Dimitrije Antic, Georg Martius

TL;DR
This paper critically examines the use of log-likelihood for heteroscedastic uncertainty estimation in neural networks, revealing pitfalls and proposing a weighted alternative that improves robustness and accuracy across various tasks.
Contribution
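It identifies failure modes of the standard log-likelihood loss when heteroscedastic neural networks are trained with gradient-based optimizers, and introduces the $\beta$-NLL loss, which weights each data point's contribution by its $\beta$-exponentiated variance estimate.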
Findings
Maximizing the log-likelihood with gradient-based optimizers can lead to very poor but stable parameter estimates.
The $\beta$-NLL approach mitigates these estimation issues.
Empirical results show improved robustness and accuracy.
Abstract
Capturing aleatoric uncertainty is a critical part of many machine learning systems. In deep learning, a common approach to this end is to train a neural network to estimate the parameters of a heteroscedastic Gaussian distribution by maximizing the logarithm of the likelihood function under the observed data. In this work, we examine this approach and identify potential hazards associated with the use of log-likelihood in conjunction with gradient-based optimizers. First, we present a synthetic example illustrating how this approach can lead to very poor but stable parameter estimates. Second, we identify the culprit to be the log-likelihood loss, along with certain conditions that exacerbate the issue. Third, we present an alternative formulation, termed $\beta$-NLL, in which each data point's contribution to the loss is weighted by the $\beta$-exponentiated variance estimate. We show…
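The weighting described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' reference implementation: the function names `beta_nll` and `grad_wrt_mean`, the default `beta=0.5`, and the toy inputs are assumptions made here. The sketch also assumes that the weight is held constant during backpropagation (a stop-gradient), a detail the abstract itself does not spell out.

```python
import numpy as np


def beta_nll(mean, var, target, beta=0.5):
    """Mean beta-weighted Gaussian NLL, with the additive constant dropped.

    Each sample's NLL is multiplied by var**beta. In an autodiff framework
    this weight would be detached so that it only rescales the gradient
    (assumption, see lead-in).
    """
    mean, var, target = (np.asarray(a, dtype=float) for a in (mean, var, target))
    nll = 0.5 * np.log(var) + 0.5 * (target - mean) ** 2 / var
    weight = var ** beta  # beta = 0 recovers the plain Gaussian NLL
    return float(np.mean(weight * nll))


def grad_wrt_mean(mean, var, target, beta=0.5):
    """Per-sample gradient of the weighted loss term w.r.t. the predicted mean.

    With the weight held constant, this is var**beta * (mean - target) / var.
    """
    mean, var, target = (np.asarray(a, dtype=float) for a in (mean, var, target))
    return var ** beta * (mean - target) / var


# Toy inputs (hypothetical): same error, very different predicted variances.
mean = [0.0, 0.0]
var = [0.1, 10.0]
target = [1.0, 1.0]

for beta in (0.0, 0.5, 1.0):
    print(beta, grad_wrt_mean(mean, var, target, beta=beta))
```

With `beta=0` the gradient on the mean is scaled by `1 / var`, so the point with the large predicted variance contributes almost nothing to fitting the mean. With `beta=1` the variance cancels and the gradient on the mean matches that of a squared-error loss. Values in between trade off the two behaviours.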
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks
