Understanding implementation pitfalls of distance-based metrics for image segmentation
Gasper Podobnik, Tomaz Vrtovec

TL;DR
This paper critically examines the implementation differences of distance-based metrics like Hausdorff distance in image segmentation, revealing significant discrepancies that affect benchmarking and clinical applications.
Contribution
It provides a systematic analysis of 11 open-source tools, highlighting implementation discrepancies and their impact on segmentation validation.
Findings
Deviations in Hausdorff distance exceeding 100 mm were observed between tools.
Multiple statistically significant differences between tools were identified.
Implementation choices alone can create a false impression of improved segmentation performance.
Abstract
Distance-based metrics, such as the Hausdorff distance (HD), are widely used to validate segmentation performance in (bio)medical imaging. However, their implementation is complex, and critical differences across open-source tools remain largely unrecognized by the community. These discrepancies undermine benchmarking efforts, introduce bias in biomarker calculations, and potentially distort medical device development and clinical commissioning. In this study, we systematically dissect 11 open-source tools that implement distance-based metric computation by performing both a conceptual analysis of their computational steps and an empirical analysis on representative two- and three-dimensional image datasets. Alarmingly, we observed deviations in HD exceeding 100 mm and identified multiple statistically significant differences between tools - demonstrating that statistically significant…
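To make concrete how an implementation choice can change a reported HD, the following is a minimal sketch, not the method of any of the tools analyzed in the paper. It assumes one illustrative choice (whether pixel spacing is applied before distances are measured) and, as a further simplification, uses all foreground pixels as the point set instead of an extracted boundary. The function name `hausdorff_distance` and the `spacing` parameter are hypothetical and introduced only for this example.

```python
import numpy as np


def hausdorff_distance(mask_a, mask_b, spacing=(1.0, 1.0)):
    """Symmetric Hausdorff distance between the foreground of two binary masks.

    Illustrative only: brute-force over all foreground pixels, with pixel
    indices scaled by `spacing` (hypothetical parameter, e.g. mm per pixel).
    """
    scale = np.asarray(spacing, dtype=float)
    # Foreground pixel indices converted to physical coordinates.
    pts_a = np.argwhere(mask_a) * scale
    pts_b = np.argwhere(mask_b) * scale
    if len(pts_a) == 0 or len(pts_b) == 0:
        raise ValueError("both masks must contain at least one foreground pixel")
    # Pairwise Euclidean distances, shape (len(pts_a), len(pts_b)).
    dists = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=-1)
    # Directed distances: the farthest nearest-neighbour in each direction.
    a_to_b = dists.min(axis=1).max()
    b_to_a = dists.min(axis=0).max()
    return float(max(a_to_b, b_to_a))


# Two single-pixel "segmentations" that are three columns apart.
reference = np.zeros((5, 5), dtype=bool)
prediction = np.zeros((5, 5), dtype=bool)
reference[1, 1] = True
prediction[1, 4] = True

# Treating pixels as unit-sized versus applying an anisotropic spacing
# (assumed here: 1.0 mm rows, 2.5 mm columns) gives different values.
hd_index_units = hausdorff_distance(reference, prediction)
hd_physical = hausdorff_distance(reference, prediction, spacing=(1.0, 2.5))
print(hd_index_units, hd_physical)  # prints: 3.0 7.5
```

The same pair of masks yields 3.0 or 7.5 depending only on whether spacing is applied, which is the kind of silent disagreement that makes HD values from different tools hard to compare.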
Taxonomy
Topics: AI in cancer detection · Radiomics and Machine Learning in Medical Imaging
