# Prospective Evidence on Artificial Intelligence-Assisted Melanoma Diagnostics: A Systematic Review and Meta-Analysis

**Authors:** Sara Laiouar-Pedari, Arlene Kühn, Christoph Wies, Carina Nogueira Garcia, Jana Therés Winterstein, Lukas Heinlein, Annemarie Hoffsommer, Tirtha Chanda, Sarah Haggenmüller, Titus J. Brinker

PMC · DOI: 10.1001/jamadermatol.2026.0217 · 2026-03-25

## TL;DR

AI systems perform similarly to dermatologists in diagnosing melanoma using dermoscopy, but more rigorous studies are needed to confirm their clinical utility.

## Contribution

This study provides a systematic review and meta-analysis of prospective evidence comparing AI and dermatologists in melanoma diagnostics.

## Key findings

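- AI and dermatologists showed comparable diagnostic performance in melanoma detection: pooled sensitivity was 80.9% for AI alone vs 78.6% for dermatologists, and pooled specificity was 75.6% vs 75.2%.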
- In the single study reporting AI-assisted dermatologists, sensitivity was 91.9% and specificity was 83.7%.
- Most studies had a high risk of bias in the patient selection and index test domains, limiting generalizability.

## Abstract

How does the diagnostic performance of artificial intelligence (AI) for melanoma in prospective dermoscopy studies compare with that of dermatologists?

Across 11 prospective studies including more than 2500 participants, AI and dermatologists showed comparable diagnostic performance. However, the evidence base remains small, and study designs are heterogeneous, with a high risk of bias in patient selection and index test domains.

Although current findings support the potential clinical application of AI, validation remains at an early stage because larger, multicenter, and methodologically rigorous prospective studies are required to confirm the safety and clinical utility of AI in routine practice.

Dermoscopy is a standard of care for melanoma diagnostics, and artificial intelligence (AI) systems are increasingly investigated as decision-support tools. Prospective evidence is essential to assess their performance compared to dermatologists.

The objective was to evaluate the diagnostic performance of dermatologists, AI systems, and dermatologists assisted by AI in prospective studies of melanoma detection, and to assess the readiness of AI for clinical use.

PubMed, Embase, Web of Science, and Google Scholar were searched from inception through July 9, 2025.

Eligible studies were prospective, used dermoscopic images, and reported or allowed calculation of performance metrics for dermatologists, AI, or dermatologists assisted by AI against a histopathologic reference standard. Nondermoscopic comparators and retrospective designs were excluded. Studies with 20 or fewer histopathologically confirmed melanomas were excluded a priori from quantitative synthesis.

Two reviewers independently screened studies and extracted data, and discrepancies or missing values were clarified among all authors. Risk of bias and applicability were assessed with QUADAS-2 and QUADAS-C. Study-level sensitivity and specificity were summarized and plotted; head-to-head comparisons were analyzed descriptively.

Diagnostic outcomes were sensitivity, specificity, accuracy, and balanced accuracy for melanoma detection.
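The four outcomes named above all follow from a 2x2 table of test results against the histopathologic reference standard. The following is a minimal sketch using the standard definitions; the class name, function name, and example counts are hypothetical illustrations and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table against the histopathologic reference standard."""
    tp: int  # melanomas correctly classified as melanoma
    fn: int  # melanomas missed
    tn: int  # non-melanomas correctly classified as non-melanoma
    fp: int  # non-melanomas classified as melanoma


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return sensitivity, specificity, accuracy, and balanced accuracy."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    accuracy = (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)
    # Balanced accuracy is the mean of sensitivity and specificity.
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = ConfusionCounts(tp=40, fn=10, tn=150, fp=50)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because melanomas are usually the minority class, accuracy is dominated by the non-melanoma lesions, whereas balanced accuracy weights both classes equally.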

Eleven prospective studies with a total of more than 2500 patients and 50 participating dermatologists were included in the analyses. Dermatologists achieved a pooled sensitivity of 78.6% (95% CI, 67.5%-88.1%) and specificity of 75.2% (95% CI, 63.3%-84.3%), whereas AI alone reached 80.9% (95% CI, 63.6%-94.5%) sensitivity and 75.6% (95% CI, 64.5%-85.6%) specificity. In the single study reporting AI-assisted dermatologists, sensitivity was 91.9% and specificity was 83.7%. In direct clinical comparisons with dermatologists, AI demonstrated higher specificity and similar sensitivity. Most studies were at high risk of bias in patient selection and index test domains, primarily due to the preselection of lesions suspected of melanoma and binary classifications.
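One informal way to read the pooled estimates above is to compare the point estimates and check whether the 95% CIs overlap. The sketch below does only that arithmetic on the reported figures; the `Estimate` type and `intervals_overlap` helper are hypothetical names. Overlapping intervals are a descriptive observation, not a formal test of equivalence, and this is not presented as the authors' analysis method.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    point: float    # pooled estimate, in percent
    ci_low: float   # lower bound of the 95% CI, in percent
    ci_high: float  # upper bound of the 95% CI, in percent


def intervals_overlap(a: Estimate, b: Estimate) -> bool:
    """True when the two confidence intervals share at least one value."""
    return a.ci_low <= b.ci_high and b.ci_low <= a.ci_high


# Pooled values as reported in the abstract.
derm_sens = Estimate(78.6, 67.5, 88.1)
ai_sens = Estimate(80.9, 63.6, 94.5)
derm_spec = Estimate(75.2, 63.3, 84.3)
ai_spec = Estimate(75.6, 64.5, 85.6)

# Differences in percentage points (AI minus dermatologists).
sens_diff = round(ai_sens.point - derm_sens.point, 1)
spec_diff = round(ai_spec.point - derm_spec.point, 1)

print(f"Sensitivity difference: {sens_diff} points, "
      f"CIs overlap: {intervals_overlap(ai_sens, derm_sens)}")
print(f"Specificity difference: {spec_diff} points, "
      f"CIs overlap: {intervals_overlap(ai_spec, derm_spec)}")
```

The differences are small relative to the width of the intervals, which is consistent with the abstract's description of the two groups as comparable.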

In this systematic review and meta-analysis of prospective studies, AI systems performed at levels comparable to dermatologists for melanoma diagnostics and may enhance performance when used as a decision-support tool. However, the frequent risk of bias and limited generalizability of current studies highlight the need for broader validation in unselected patient populations in the clinical setting.

This systematic review and meta-analysis evaluates the diagnostic performance of dermatologists with and without artificial intelligence support in studies of melanoma detection.

## Linked entities

- **Diseases:** melanoma (MONDO:0005105)

## Full-text entities

- **Diseases:** Melanoma (MESH:D008545)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13019344/full.md

---
Source: https://tomesphere.com/paper/PMC13019344