Post Training Uncertainty Calibration of Deep Networks For Medical Image Segmentation

Axel-Jan Rousseau; Thijs Becker; Jeroen Bertels; Matthew B. Blaschko; Dirk Valkenborg

arXiv:2010.14290·eess.IV·October 28, 2020·ISBI

TL;DR

This paper evaluates post-training calibration methods for deep neural networks in medical image segmentation, showing they can improve confidence score calibration and are competitive with MC dropout, with varied results across methods.

Contribution

It introduces and compares several straightforward post hoc calibration techniques, including novel methods, for neural networks trained with different loss functions in medical segmentation.

Findings

01 Post hoc calibration improves confidence scores in segmentation models.

02 Models trained with soft Dice loss are not necessarily less calibrated than those trained with cross-entropy.

03 Calibration methods are competitive with MC dropout, but subject-level variance remains similar.

Abstract

Neural networks for automated image segmentation are typically trained to achieve maximum accuracy, while less attention has been given to the calibration of their confidence scores. However, well-calibrated confidence scores provide valuable information towards the user. We investigate several post hoc calibration methods that are straightforward to implement, some of which are novel. They are compared to Monte Carlo (MC) dropout. They are applied to neural networks trained with cross-entropy (CE) and soft Dice (SD) losses on BraTS 2018 and ISLES 2018. Surprisingly, models trained on SD loss are not necessarily less calibrated than those trained on CE loss. In all cases, at least one post hoc method improves the calibration. There is limited consistency across the results, so we can't conclude on one method being superior. In all cases, post hoc calibration is competitive with MC…
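
The portion of the abstract shown here does not say how calibration is measured. As a rough illustration of what "well-calibrated" means for voxel-wise foreground probabilities, the sketch below computes the expected calibration error (ECE), a commonly used summary. The function name, the binary foreground/background setting, and the choice of 10 equal-width bins are assumptions made for illustration; this is not the paper's evaluation code.

```python
import numpy as np


def expected_calibration_error(probs, labels, n_bins=10):
    """Illustrative ECE for binary foreground probabilities.

    probs  : predicted foreground probabilities in [0, 1], any shape
    labels : ground-truth labels (0 or 1), same shape as probs
    n_bins : number of equal-width probability bins (assumed value)
    """
    probs = np.asarray(probs, dtype=float).ravel()
    labels = np.asarray(labels, dtype=float).ravel()

    # Assign each voxel to a bin; clip so that p == 1.0 lands in the last bin.
    bin_ids = np.minimum((probs * n_bins).astype(int), n_bins - 1)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue
        mean_prob = probs[mask].mean()   # average predicted probability in the bin
        frac_pos = labels[mask].mean()   # observed foreground frequency in the bin
        # Weight the gap by the fraction of voxels that fall in this bin.
        ece += mask.mean() * abs(mean_prob - frac_pos)
    return ece


# Predictions of 0.9 that are right 9 times out of 10 are well calibrated.
calibrated = expected_calibration_error([0.9] * 10, [1] * 9 + [0])

# Fully confident predictions that are always wrong are maximally miscalibrated.
overconfident = expected_calibration_error([1.0] * 4, [0] * 4)
```

A post hoc method in this sense would transform `probs` after training, leaving the network weights untouched, with the aim of lowering a score of this kind on held-out data.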

Code & Models

Repositories

AxelJanRousseau/PostTrainCalibration
PyTorch · Official
