Comparison of computer systems and ranking criteria for automatic melanoma detection in dermoscopic images
Kajsa Møllersen, Maciel Zortea, Thomas R. Schopf, Herbert Kirchesch, Fred Godtliebsen

TL;DR
This study evaluates how different ranking criteria, segmentation methods, and classifiers affect the performance assessment of computer systems for melanoma detection in dermoscopic images, emphasizing the importance of high-sensitivity measures for clinical relevance.
Contribution
It systematically compares ranking measures and classifiers, revealing their impact on system performance evaluation and highlighting the limited effect of segmentation improvements.
Findings
Ranking criteria significantly influence system rankings.
Classifier choice impacts diagnostic accuracy more than segmentation method.
High-sensitivity measures are crucial for clinical application.
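The first and third findings can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two systems ("A" and "B"), their scores, and the labels are made-up toy data, not results from the challenge, and the two measures (area under the ROC curve and specificity at a fixed 95% sensitivity) are common choices that are not necessarily among the five measures compared in the paper.

```python
import math

# Toy ground truth (hypothetical): 1 = melanoma, 0 = benign.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Hypothetical melanoma scores from two made-up systems, aligned with labels.
systems = {
    "A": [0.90, 0.80, 0.70, 0.10, 0.60, 0.50, 0.40, 0.30, 0.20, 0.05],
    "B": [0.60, 0.60, 0.55, 0.50, 0.70, 0.65, 0.40, 0.30, 0.20, 0.10],
}


def auc(scores, labels):
    """Area under the ROC curve: the fraction of (melanoma, benign) pairs
    in which the melanoma gets the higher score (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def specificity_at_sensitivity(scores, labels, min_sensitivity=0.95):
    """Specificity at the highest threshold that still detects at least
    min_sensitivity of the melanomas (score >= threshold means melanoma)."""
    pos = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    neg = [s for s, y in zip(scores, labels) if y == 0]
    needed = math.ceil(min_sensitivity * len(pos))  # melanomas that must be caught
    threshold = pos[needed - 1]
    true_negatives = sum(n < threshold for n in neg)
    return true_negatives / len(neg)


def rank(measure):
    """Order the system names from best to worst under the given measure."""
    return sorted(systems, key=lambda name: measure(systems[name], labels),
                  reverse=True)


for measure in (auc, specificity_at_sensitivity):
    print(measure.__name__, rank(measure))
```

With these toy numbers, system A leads on AUC while system B leads on specificity at 95% sensitivity: A separates most lesions well but gives one melanoma a very low score, so catching that melanoma forces a threshold at which almost every benign lesion is flagged too.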
Abstract
Melanoma is the deadliest form of skin cancer. Computer systems can assist in melanoma detection, but are not widespread in clinical practice. In 2016, an open challenge in classification of dermoscopic images of skin lesions was announced. A training set of 900 images with corresponding class labels and semi-automatic/manual segmentation masks was released for the challenge. An independent test set of 379 images was used to rank the participants. This article demonstrates the impact of ranking criteria, segmentation method and classifier, and highlights the clinical perspective. We compare five different measures for diagnostic accuracy by analysing the resulting ranking of the computer systems in the challenge. Choice of performance measure had a great impact on the ranking. Systems that were ranked among the top three for one measure dropped to the bottom half when changing…
