Good Enough: Is it Worth Improving your Label Quality?
Alexander Jaus, Zdravko Marinov, Constantin Seibold, Simon Reiß, Jens Kleesiek, Rainer Stiefelhagen

TL;DR
This paper systematically evaluates the impact of label quality improvements in medical image segmentation, finding that gains are unclear when the improvement falls below a small threshold and that pre-training is minimally affected by label quality.
Contribution
It provides empirical evidence on the conditions under which improving label quality is beneficial in medical image segmentation tasks.
Findings
Higher-quality labels improve in-domain performance
Minimal impact of label quality on pre-training
Gains are unclear when the label quality improvement is below a small threshold
Abstract
Improving label quality in medical image segmentation is costly, but its benefits remain unclear. We systematically evaluate its impact using multiple pseudo-labeled versions of CT datasets, generated by models like nnU-Net, TotalSegmentator, and MedSAM. Our results show that while higher-quality labels improve in-domain performance, the gains remain unclear when the quality improvement is below a small threshold. For pre-training, label quality has minimal impact, suggesting that models transfer general concepts rather than detailed annotations. These findings provide guidance on when improving label quality is worth the effort.
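To make "label quality" concrete, the sketch below scores two pseudo-labeled versions of the same mask against a reference annotation. It assumes the Dice coefficient as the quality measure, which is a common choice in segmentation but is not confirmed by the abstract as the paper's metric. The helper names (dice_score, square_mask) and the toy masks are hypothetical and are not taken from the paper.

```python
def dice_score(pred, ref):
    """Dice overlap between two binary masks given as flat 0/1 sequences.

    Assumption: Dice stands in for "label quality" here; the paper may use
    a different measure.
    """
    pred = [bool(v) for v in pred]
    ref = [bool(v) for v in ref]
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(p and r for p, r in zip(pred, ref))
    denom = sum(pred) + sum(ref)
    # Two empty masks agree perfectly by convention.
    return 1.0 if denom == 0 else 2.0 * overlap / denom


def square_mask(size, rows, cols):
    """Flattened size x size mask that is 1 inside the given row/col ranges."""
    return [1 if r in rows and c in cols else 0
            for r in range(size) for c in range(size)]


# Hypothetical toy data: one reference mask and two pseudo-labels for it.
reference = square_mask(8, range(2, 6), range(2, 6))  # 16 voxels
pseudo_a = square_mask(8, range(2, 6), range(2, 5))   # 12 voxels, all inside
pseudo_b = square_mask(8, range(3, 7), range(3, 7))   # 16 voxels, shifted

scores = {
    "pseudo_a": dice_score(pseudo_a, reference),
    "pseudo_b": dice_score(pseudo_b, reference),
}
for name, score in scores.items():
    print(f"{name}: Dice = {score:.3f}")
```

Ranking pseudo-label sets by a score like this is one way to order them from lower to higher quality before comparing the models trained on each.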
Taxonomy
Topics: Clinical practice guidelines implementation
