How precise are performance estimates for typical medical image segmentation tasks?

Rosana El Jurdi; Olivier Colliot

arXiv:2210.14677·cs.CV·May 25, 2023

Open Access

TL;DR

This study estimates how precise performance estimates typically are in medical image segmentation studies. It shows that small test sets produce wide confidence intervals, which limits the reliability of reported results.

Contribution

The paper systematically assesses the precision of performance estimates in medical image segmentation using both Gaussian and bootstrap methods, highlighting the limitations of small test sets.

Findings

01. Small test sets lead to wide confidence intervals (~8 Dice points for 20 samples).

02. Bootstrapping provides a distribution-free way to estimate confidence intervals.

03. Performance spread and test set size significantly affect estimate precision.
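
The two interval estimates named in the findings can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `gaussian_ci` and `bootstrap_ci`, the percentile variant of the bootstrap, the number of resamples, and the synthetic Dice scores are all assumptions made for the example.

```python
import numpy as np


def gaussian_ci(scores, z=1.96):
    """95% CI of the mean under a Gaussian assumption: mean +/- z * SEM."""
    scores = np.asarray(scores, dtype=float)
    mean = scores.mean()
    # Standard error of the mean, using the unbiased sample standard deviation.
    sem = scores.std(ddof=1) / np.sqrt(scores.size)
    return float(mean - z * sem), float(mean + z * sem)


def bootstrap_ci(scores, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI of the mean (no distributional assumption)."""
    scores = np.asarray(scores, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample the test set with replacement, n_resamples times.
    idx = rng.integers(0, scores.size, size=(n_resamples, scores.size))
    means = scores[idx].mean(axis=1)
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


if __name__ == "__main__":
    # Hypothetical per-subject Dice scores for a test set of 20 samples.
    rng = np.random.default_rng(42)
    dice = np.clip(rng.normal(loc=0.85, scale=0.08, size=20), 0.0, 1.0)
    print("mean Dice:", dice.mean())
    print("Gaussian 95% CI:", gaussian_ci(dice))
    print("Bootstrap 95% CI:", bootstrap_ci(dice))
```

When the per-subject scores are roughly symmetric, the two intervals should be close; the bootstrap is mainly useful when the score distribution is skewed or bounded, as Dice scores near 1 often are.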

Abstract

An important issue in medical image processing is to be able to estimate not only the performances of algorithms but also the precision of the estimation of these performances. Reporting precision typically amounts to reporting standard-error of the mean (SEM) or equivalently confidence intervals. However, this is rarely done in medical image segmentation studies. In this paper, we aim to estimate what is the typical confidence that can be expected in such studies. To that end, we first perform experiments for Dice metric estimation using a standard deep learning model (U-net) and a classical task from the Medical Segmentation Decathlon. We extensively study precision estimation using both Gaussian assumption and bootstrapping (which does not require any assumption on the distribution). We then perform simulations for other test set sizes and performance spreads. Overall, our work shows…
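
For reference, the standard-error and Gaussian confidence-interval quantities mentioned in the abstract are usually written as below. These are the textbook definitions, given here as a reading aid; the symbols ($n$ for test set size, $d_i$ for the per-subject Dice score, $\bar{d}$ and $\hat{\sigma}$ for the sample mean and standard deviation) are notation chosen for this note and may differ from the paper's.

```latex
\[
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i,
\qquad
\hat{\sigma} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(d_i - \bar{d}\right)^2},
\qquad
\mathrm{SEM} = \frac{\hat{\sigma}}{\sqrt{n}}
\]
\[
\mathrm{CI}_{95\%} = \left[\,\bar{d} - 1.96\,\mathrm{SEM},\;\; \bar{d} + 1.96\,\mathrm{SEM}\,\right]
\]
```

Under these definitions the interval width is $2 \times 1.96\,\hat{\sigma}/\sqrt{n}$, so it grows with the spread of the per-subject scores and shrinks only as $1/\sqrt{n}$: halving the width requires roughly four times as many test samples.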

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Medical Image Segmentation Techniques

Methods: Test