Large-Scale Label Quality Assessment for Medical Segmentation via a Vision-Language Judge and Synthetic Data
Yixiong Chen, Zongwei Zhou, Wenxuan Li, Alan Yuille

TL;DR
This paper introduces SegAE, a vision-language model that automatically assesses label quality in large-scale medical segmentation datasets, improving data efficiency and reducing annotation costs.
Contribution
We develop SegAE, a lightweight model trained on over four million image-label pairs to evaluate label quality across 142 anatomical structures, enhancing dataset quality control.
Findings
SegAE achieves a correlation coefficient of 0.902 with ground-truth Dice scores.
SegAE evaluates a 3D mask in 0.06 seconds.
Using SegAE reduces annotation costs by one-third and quality-checking time by 70%.
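To make the first finding concrete, here is a minimal sketch of the two quantities it refers to: the Dice score between a candidate mask and a reference mask, and the correlation between predicted quality scores and those Dice values. It assumes binary NumPy masks and a Pearson correlation; the summary does not say which correlation coefficient is reported, and the function names `dice_score` and `pearson_r` are illustrative, not part of SegAE.

```python
import numpy as np


def dice_score(pred, target):
    """Dice similarity between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks are empty: treat as perfect agreement (a convention, assumed here).
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / denom)


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of scores."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


# Two toy 3D masks of 32 voxels each that overlap on 16 voxels.
mask_a = np.zeros((4, 4, 4), dtype=bool)
mask_a[:2] = True
mask_b = np.zeros((4, 4, 4), dtype=bool)
mask_b[1:3] = True
half_overlap = dice_score(mask_a, mask_b)  # 2 * 16 / (32 + 32)
```

A quality estimator could be checked the same way: compute `dice_score` for each label against its reference, then pass the predicted scores and the Dice values to `pearson_r`.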
Abstract
Large-scale medical segmentation datasets often combine manual and pseudo-labels of uneven quality, which can compromise training and evaluation. Low-quality labels can degrade performance and make model training less robust. To address this issue, we propose SegAE (Segmentation Assessment Engine), a lightweight vision-language model (VLM) that automatically predicts label quality across 142 anatomical structures. Trained on over four million image-label pairs with quality scores, SegAE achieves a high correlation coefficient of 0.902 with ground-truth Dice similarity and evaluates a 3D mask in 0.06 s. SegAE shows several practical benefits: (I) Our analysis reveals widespread low-quality labeling across public datasets; (II) SegAE improves data efficiency and training performance in active and semi-supervised learning, reducing dataset annotation cost by one-third and…
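The abstract mentions using predicted label quality in active and semi-supervised learning but, as truncated here, does not describe the procedure. The following is only a hypothetical sketch of one simple way per-label quality scores could be used for triage: labels scoring at or above a threshold are kept, and the rest are queued for review, worst first. The function name `split_by_quality`, the case identifiers, and the threshold of 0.8 are all assumptions for illustration, not values from the paper.

```python
def split_by_quality(scores, threshold=0.8):
    """Split case IDs into (keep, review) by predicted quality score.

    scores: dict mapping a case ID to a predicted quality in [0, 1].
    threshold: hypothetical cutoff; not taken from the paper.
    """
    keep = [case for case, s in scores.items() if s >= threshold]
    review = [case for case, s in scores.items() if s < threshold]
    # Lowest predicted quality first, so annotators see the worst labels first.
    review.sort(key=lambda case: scores[case])
    return keep, review


# Toy scores standing in for the output of a quality estimator.
predicted = {"case_001": 0.95, "case_002": 0.40, "case_003": 0.70}
keep, review = split_by_quality(predicted)
```

With these toy scores, only `case_001` is kept and the review queue starts with `case_002`, the label with the lowest predicted quality.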
Taxonomy
Topics: Advanced Neural Network Applications · AI in cancer detection · COVID-19 diagnosis using AI
