Confidence Calibration and Predictive Uncertainty Estimation for Deep Medical Image Segmentation

Alireza Mehrtash; William M. Wells III; Clare M. Tempany; Purang Abolmaesumi; Tina Kapur

arXiv:1911.13273·eess.IV·July 6, 2020

TL;DR

This paper investigates the calibration of deep neural networks for medical image segmentation, comparing loss functions, proposing ensembling for better confidence estimates, and evaluating out-of-distribution detection across multiple medical imaging tasks.

Contribution

The paper systematically compares cross entropy loss with Dice loss, proposes model ensembling for confidence calibration, and evaluates uncertainty estimation and out-of-distribution detection in medical image segmentation.

Findings

01

Ensembling improves confidence calibration.

02

Dice loss affects uncertainty estimation differently than cross entropy.

03

Calibrated models better predict segmentation quality and detect OOD examples.
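To make the first finding concrete, the following is a minimal sketch of how ensemble averaging and a calibration metric might be computed for binary segmentation. It is not the authors' implementation: the function names `ensemble_mean` and `expected_calibration_error`, the flat list representation of per-pixel foreground probabilities, the 0.5 decision threshold, and the choice of expected calibration error (ECE) with 10 equal-width bins are all assumptions made for illustration.

```python
def ensemble_mean(prob_maps):
    """Average per-pixel foreground probabilities over several models.

    prob_maps: list of equally long lists, one per ensemble member
    (hypothetical layout: each inner list is a flattened probability map).
    """
    n_models = len(prob_maps)
    n_pixels = len(prob_maps[0])
    return [sum(m[i] for m in prob_maps) / n_models for i in range(n_pixels)]


def expected_calibration_error(probs, labels, n_bins=10):
    """Expected calibration error for binary per-pixel predictions.

    Confidence is the probability of the predicted class, max(p, 1 - p).
    Pixels are grouped into equal-width confidence bins, and ECE is the
    pixel-weighted mean of |accuracy - mean confidence| over the bins.
    """
    # Each bin stores [pixel count, summed confidence, summed correctness].
    bins = [[0, 0.0, 0.0] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        pred = 1 if p >= 0.5 else 0
        conf = p if pred == 1 else 1.0 - p
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx][0] += 1
        bins[idx][1] += conf
        bins[idx][2] += 1.0 if pred == y else 0.0
    n = len(probs)
    # (count / n) * |acc - conf| simplifies to |sum_correct - sum_conf| / n.
    return sum(abs(s_acc - s_conf) for cnt, s_conf, s_acc in bins if cnt) / n


if __name__ == "__main__":
    labels = [1, 1, 1, 0]
    member_a = [0.99, 0.99, 0.99, 0.99]  # toy, overconfident on the last pixel
    member_b = [0.99, 0.99, 0.99, 0.01]
    averaged = ensemble_mean([member_a, member_b])
    print("ECE of member A:", expected_calibration_error(member_a, labels))
    print("ECE of ensemble:", expected_calibration_error(averaged, labels))
```

A lower ECE means the reported confidence tracks the observed accuracy more closely. The toy numbers above only exercise the functions and say nothing about the results reported in the paper.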

Abstract

Fully convolutional neural networks (FCNs), and in particular U-Nets, have achieved state-of-the-art results in semantic segmentation for numerous medical imaging applications. Moreover, batch normalization and Dice loss have been used successfully to stabilize and accelerate training. However, these networks are poorly calibrated, i.e., they tend to produce overconfident predictions for both correct and erroneous classifications, making them unreliable and hard to interpret. In this paper, we study predictive uncertainty estimation in FCNs for medical image segmentation. We make the following contributions: 1) We systematically compare cross entropy loss with Dice loss in terms of segmentation quality and uncertainty estimation of FCNs; 2) We propose model ensembling for confidence calibration of the FCNs trained with batch normalization and Dice loss; 3) We assess the ability of…
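The two loss functions compared in the abstract can be sketched as follows for the binary case. This is an illustrative sketch under stated assumptions, not the paper's code: the function names, the flat list inputs, the smoothing constant `eps`, and the particular soft Dice formulation (sum of probabilities plus sum of labels in the denominator) are choices made here, and the paper may use a different variant.

```python
import math


def cross_entropy_loss(probs, labels, eps=1e-7):
    """Mean binary cross entropy over pixels.

    probs: predicted foreground probabilities; labels: 0 or 1 per pixel.
    Probabilities are clipped to [eps, 1 - eps] to avoid log(0).
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def soft_dice_loss(probs, labels, eps=1e-7):
    """One minus the soft Dice coefficient (assumed formulation).

    Soft Dice = 2 * sum(p * y) / (sum(p) + sum(y)), with eps added to the
    numerator and denominator so that empty masks do not divide by zero.
    """
    intersection = sum(p * y for p, y in zip(probs, labels))
    denom = sum(probs) + sum(labels)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


if __name__ == "__main__":
    labels = [1, 1, 0, 0]
    probs = [0.9, 0.6, 0.4, 0.1]
    print("cross entropy:", cross_entropy_loss(probs, labels))
    print("soft Dice loss:", soft_dice_loss(probs, labels))
```

Cross entropy scores each pixel's probability independently, while the Dice loss scores the overlap of the whole predicted mask with the ground truth, which is one reason the two can behave differently with respect to per-pixel confidence.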

Taxonomy

Methods: Test · Dice Loss · Batch Normalization