Quality Sentinel: Estimating Label Quality and Errors in Medical Segmentation Datasets
Yixiong Chen, Zongwei Zhou, Alan Yuille

TL;DR
This paper introduces Quality Sentinel, a regression model that estimates label quality in medical segmentation datasets, enabling improved dataset analysis, annotation correction, and AI training efficiency.
Contribution
We developed a novel regression model trained on 4 million image-label pairs to accurately predict label quality across diverse medical segmentation datasets.
Findings
Strong correlation (r=0.902) between predicted and ground truth quality.
Identifying and correcting poorly annotated labels reduced annotation costs by one-third.
Enhanced AI training with high-quality pseudo labels improved performance by 33-88%.
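The correlation in the first finding compares the model's predicted quality scores with the ground-truth ones. A minimal sketch of that comparison follows, assuming r denotes the Pearson correlation coefficient (the summary does not say which coefficient is used). The function name `pearson_r` and the sample scores are hypothetical, made up only to show the calculation; they are not results from the paper.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centered products and centered squares.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-label quality scores (illustrative values only).
predicted_quality = [0.91, 0.45, 0.78, 0.30, 0.66]
ground_truth_quality = [0.88, 0.52, 0.81, 0.25, 0.70]
r = pearson_r(predicted_quality, ground_truth_quality)
print(f"r = {r:.3f}")
```

A value of r close to 1 means the predicted scores rank and scale labels much as the ground-truth scores do.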
Abstract
An increasing number of public datasets have shown a transformative impact on automated medical segmentation. However, these datasets often vary in label quality, ranging from manual expert annotations to AI-generated pseudo-annotations. There is no systematic, reliable, and automatic quality control (QC). To fill this gap, we introduce a regression model, Quality Sentinel, to estimate label quality compared with manual annotations in medical segmentation datasets. This regression model was trained on over 4 million image-label pairs that we created. Each pair has a varying but quantified label quality measured against manual annotations, which enables us to predict the label quality of any image-label pair at inference time. Our Quality Sentinel can predict the label quality of 142 body structures. The predicted label quality, quantified by the Dice Similarity Coefficient (DSC)…
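For readers unfamiliar with the metric, the Dice Similarity Coefficient between a candidate mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it for binary masks flattened to 0/1 sequences. It illustrates the standard definition only, not the authors' implementation; the function name `dice_similarity`, the toy masks, and the convention of returning 1.0 for two empty masks are assumptions.

```python
from typing import Sequence


def dice_similarity(pred: Sequence[int], ref: Sequence[int]) -> float:
    """DSC = 2|A ∩ B| / (|A| + |B|) for two flat binary (0/1) masks."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of voxels")
    # Voxels marked as foreground in both masks.
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    # Foreground voxel counts of each mask, added together.
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


# Toy example: a 6-voxel reference annotation and a candidate label.
reference = [0, 1, 1, 1, 0, 0]
candidate = [0, 1, 1, 0, 1, 0]
score = dice_similarity(candidate, reference)
print(f"DSC = {score:.3f}")
```

DSC ranges from 0 (no overlap) to 1 (identical masks), which makes it a natural regression target for a label-quality score.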
Taxonomy
Topics: AI in cancer detection · Artificial Intelligence in Healthcare · Machine Learning in Healthcare
