Understanding metric-related pitfalls in image analysis validation
Annika Reinke, Minu D. Tizabi, Michael Baumgartner, Matthias Eisenmann, Doreen Heckmann-Nötzel, A. Emre Kavur, Tim Rädsch, Carole H. Sudre, Laura Acion, Michela Antonelli, Tal Arbel, Spyridon Bakas, Arriel Benis, Matthew Blaschko, Florian Buettner, M. Jorge Cardoso

TL;DR
This paper provides a comprehensive, accessible resource on common pitfalls in validation metrics for image analysis, aiming to improve research reliability and cross-disciplinary understanding.
Contribution
It introduces a structured, expert-validated taxonomy of pitfalls in validation metrics, with illustrative examples, enhancing accessibility for researchers across disciplines.
Findings
Identified key pitfalls in validation metrics for image analysis.
Developed a comprehensive taxonomy of pitfalls.
Provided illustrative examples for each pitfall.
Abstract
Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice. However, increasing evidence shows that particularly in image analysis, metrics are often chosen inadequately in relation to the underlying research problem. This could be attributed to a lack of accessibility of metric-related knowledge: While taking into account the individual strengths, weaknesses, and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multi-stage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides the first reliable and comprehensive common point of access…
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Delphi Technique in Research · Artificial Intelligence in Healthcare and Education
