Comparing the quality of neural network uncertainty estimates for classification problems

Daniel Ries; Joshua Michalenko; Tyler Ganter; Rashad Imad-Fayez Baiyasi; Jason Adams

arXiv:2308.05903·cs.LG·August 14, 2023

Open Access

TL;DR

This paper evaluates and compares various uncertainty quantification methods for deep learning classifiers, highlighting the inconsistency among methods and emphasizing the need for principled quality assessment metrics.

Contribution

It introduces a framework for evaluating UQ methods in deep learning, comparing multiple approaches using statistical metrics on real and simulated data.

Findings

01. MCMC Bayesian neural networks perform best overall.

02. Bootstrapped neural networks are a close second in quality.

03. Different UQ methods can produce markedly different uncertainty estimates.

Abstract

Traditional deep learning (DL) models are powerful classifiers, but many approaches do not provide uncertainties for their estimates. Uncertainty quantification (UQ) methods for DL models have received increased attention in the literature due to their usefulness in decision making, particularly for high-consequence decisions. However, there has been little research done on how to evaluate the quality of such methods. We use statistical methods of frequentist interval coverage and interval width to evaluate the quality of credible intervals, and expected calibration error to evaluate classification predicted confidence. These metrics are evaluated on Bayesian neural networks (BNN) fit using Markov Chain Monte Carlo (MCMC) and variational inference (VI), bootstrapped neural networks (NN), Deep Ensembles (DE), and Monte Carlo (MC) dropout. We apply these different UQ for DL methods to a…
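The three metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes each interval targets a quantity whose true value is known (for example, a true class probability on simulated data), and that expected calibration error uses equal-width confidence bins. The function names, the bin count, and the toy numbers are all hypothetical.

```python
import numpy as np


def interval_coverage_and_width(lower, upper, truth):
    """Empirical coverage and mean width of a set of intervals.

    Assumption: `truth` holds the known target value for each interval,
    which is usually only available on simulated data.
    """
    lower, upper, truth = (np.asarray(a, dtype=float) for a in (lower, upper, truth))
    covered = (lower <= truth) & (truth <= upper)
    return float(covered.mean()), float((upper - lower).mean())


def expected_calibration_error(confidence, correct, n_bins=10):
    """Expected calibration error with equal-width confidence bins.

    confidence: predicted confidence of the chosen class, in [0, 1].
    correct:    1 if the prediction was right, 0 otherwise.
    """
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges only, so bin indices run from 0 to n_bins - 1.
    bin_ids = np.digitize(confidence, edges[1:-1], right=True)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidence[mask].mean())
            ece += mask.mean() * gap  # weight by the share of samples in the bin
    return float(ece)


# Toy inputs, made up purely for illustration.
coverage, width = interval_coverage_and_width(
    lower=[0.1, 0.4, 0.6], upper=[0.3, 0.8, 0.9], truth=[0.2, 0.9, 0.7]
)
ece = expected_calibration_error(
    confidence=[0.95, 0.95, 0.65, 0.65], correct=[1, 1, 1, 0]
)
```

Under this reading, a well-behaved 95% interval would show coverage near 0.95 with as small a mean width as possible, and a well-calibrated classifier would show an expected calibration error near zero.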

Taxonomy

Topics: Statistical Methods and Inference · Fault Detection and Control Systems · Advanced Statistical Methods and Models

Methods: Variational Inference · Deep Ensembles