Performance evaluation of deep learning models for image analysis: considerations for visual control and statistical metrics
Christof A. Bertram, Jonas Ammeling, Alexander Bartel, Gillian Beamer, Marc Aubreville

TL;DR
This paper reviews methods for evaluating deep learning models in image analysis, emphasizing the importance of combining visual and statistical performance assessments to ensure reliability and robustness in practical applications.
Contribution
It compares visual and statistical performance control methods, highlighting their strengths and advocating for a combined approach for thorough model evaluation.
Findings
Visual and statistical evaluations are complementary.
Combining both methods yields the most comprehensive performance insight.
Proper dataset and metric selection are critical for reliable assessment.
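The point about metric selection can be illustrated with a small, hypothetical example. The sketch below is not taken from the paper; the toy labels, the helper names (`confusion_counts`, `metrics`), and the 5% prevalence of the object of interest are all assumptions chosen for illustration. It shows how accuracy can look strong on an imbalanced dataset even when a model detects nothing, while recall and the F1 score expose the failure.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1; undefined ratios fall back to 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 written in count form: 2TP / (2TP + FP + FN)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data: 5 objects of interest among 100 candidates (assumed prevalence).
y_true = [1] * 5 + [0] * 95

# A trivial "model" that never predicts the object of interest.
y_pred_trivial = [0] * 100

# A hypothetical detector: finds 4 of 5 objects and raises 2 false alarms.
y_pred_detector = [1] * 4 + [0] * 1 + [1] * 2 + [0] * 93

trivial = metrics(y_true, y_pred_trivial)
detector = metrics(y_true, y_pred_detector)

for name, result in (("trivial", trivial), ("detector", detector)):
    print(name, {k: round(v, 3) for k, v in result.items()})
```

On this toy data the two models differ by only two percentage points in accuracy, whereas recall and F1 separate them clearly. This is one reason a metric should be chosen with the class balance of the test set in mind.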
Abstract
Deep learning-based automated image analysis (DL-AIA) has been shown to outperform trained pathologists in tasks related to feature quantification. Owing to these capabilities, the use of DL-AIA tools is currently extending from proof-of-principle studies to routine applications such as the analysis of patient samples (diagnostic pathology), regulatory safety assessment (toxicologic pathology), and recurrent research tasks. To ensure that DL-AIA applications are safe and reliable, it is critical to conduct a thorough and objective generalization performance assessment (i.e., the ability of the algorithm to accurately predict patterns of interest) and possibly evaluate model robustness (i.e., the algorithm's capacity to maintain predictive accuracy on images from different sources). In this article, we review the practices for performance assessment in veterinary pathology publications by which two…
Taxonomy
Topics: AI in cancer detection · Cell Image Analysis Techniques · Digital Imaging for Blood Diseases
