The Skin Game: Revolutionizing Standards for AI Dermatology Model Comparison
Łukasz Miętkiewicz, Leon Ciechanowski, Dariusz Jemielniak

TL;DR
This paper critically analyzes current practices in AI dermatology research, demonstrates the performance of a vision transformer model across three datasets, and emphasizes the need for standardized evaluation protocols and methodological rigor.
Contribution
It offers a systematic review of methodological inconsistencies and proposes a comprehensive framework with recommendations for robust model evaluation and deployment in clinical dermatology.
Findings
DINOv2-Large achieved macro F1-scores of 0.85, 0.71, and 0.84 on three datasets.
Analysis reveals overestimated metrics due to data leakage and inconsistent reporting.
Attention maps show both sophisticated feature recognition and vulnerabilities in atypical cases.
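Macro F1 is the unweighted mean of per-class F1 scores, so rare classes count as much as common ones. The sketch below is illustrative only: it is not the authors' evaluation code, and the labels are made-up toy data rather than anything from the paper. It shows how accuracy can look acceptable on an imbalanced set while macro F1 exposes a class that is never predicted.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of exactly matching predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels: a classifier that always predicts the majority class.
y_true = ["nevus"] * 8 + ["melanoma"] * 2
y_pred = ["nevus"] * 10

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

On this toy input the minority class contributes an F1 of zero, which pulls the macro average well below the accuracy.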
Abstract
Deep learning approaches in dermatological image classification have shown promising results, yet the field faces significant methodological challenges that impede proper evaluation. This paper presents a dual contribution: first, a systematic analysis of current methodological practices in skin disease classification research, revealing substantial inconsistencies in data preparation, augmentation strategies, and performance reporting; second, a comprehensive training and evaluation framework demonstrated through experiments with the DINOv2-Large vision transformer across three benchmark datasets (HAM10000, DermNet, ISIC Atlas). The analysis identifies concerning patterns, including pre-split data augmentation and validation-based reporting, that can lead to overestimated metrics, and it highlights the lack of unified methodology standards. The experimental results demonstrate…
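To illustrate why pre-split augmentation can inflate metrics, here is a minimal sketch under stated assumptions: images are represented only by integer IDs, an "augmented variant" is a hypothetical `(source_id, variant_index)` pair, and the function names are invented for this example. It is not the paper's framework. Augmenting before splitting lets variants of one source image land on both sides of the split, whereas splitting first keeps every source image on one side only.

```python
import random


def augment(image_ids, copies):
    """Stand-in for augmentation: the original (index 0) plus `copies` variants."""
    return [(i, k) for i in image_ids for k in range(copies + 1)]


def augment_then_split(image_ids, n_test, copies=3, seed=0):
    """Leak-prone order: augment everything, then split the augmented samples."""
    samples = augment(image_ids, copies)
    random.Random(seed).shuffle(samples)
    return samples[n_test:], samples[:n_test]  # (train, test)


def split_then_augment(image_ids, n_test, copies=3, seed=0):
    """Safer order: split source images first, then augment the training set only."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    test_ids, train_ids = ids[:n_test], ids[n_test:]
    return augment(train_ids, copies), [(i, 0) for i in test_ids]


def leaked_sources(train, test):
    """Source image IDs that appear on both sides of the split."""
    return {i for i, _ in train} & {i for i, _ in test}


ids = range(10)  # hypothetical image IDs

# 40 augmented samples, 10 held out. Because 10 is not a multiple of 4,
# at least one source image must be split between train and test.
train_a, test_a = augment_then_split(ids, n_test=10)
print("augment-then-split leaks:", len(leaked_sources(train_a, test_a)) > 0)

train_b, test_b = split_then_augment(ids, n_test=2)
print("split-then-augment leaks:", len(leaked_sources(train_b, test_b)) > 0)
```

The same check applies to any grouping key: if a dataset provides patient or lesion identifiers, splitting on that key rather than on individual images follows the same logic.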
Taxonomy
Topics: Cutaneous Melanoma Detection and Management
Methods: Attention Is All You Need · Softmax · Layer Normalization · Linear Layer · Dense Connections · Residual Connection · Multi-Head Attention · Vision Transformer
