When AI and Experts Agree on Error: Intrinsic Ambiguity in Dermatoscopic Images

Loris Cino; Pier Luigi Mazzeo; Alessandro Martella; Giulia Radi; Renato Rossi; Cosimo Distante

arXiv:2604.00651·cs.CV·April 2, 2026

TL;DR

This study shows that certain dermatoscopic images are intrinsically ambiguous: both AI models and human experts systematically fail on them, which points to fundamental limits in dermatological diagnosis.

Contribution

The paper introduces an analysis of intrinsic image ambiguity that affects both AI models and human experts, supported by experiments and by open data for reproducibility.

Findings

01

Multiple CNN architectures misclassify the same subset of images, and this overlap is larger than random chance would explain.

02

Expert dermatologists' diagnostic accuracy drops significantly on the images that the AI models misclassify.

03

Image quality is identified as a key factor in intrinsic diagnostic ambiguity.

Abstract

The integration of artificial intelligence (AI), particularly Convolutional Neural Networks (CNNs), into dermatological diagnosis demonstrates substantial clinical potential. While existing literature predominantly benchmarks algorithmic performance against human experts, our study adopts a novel perspective by investigating the intrinsic complexity of dermatoscopic images. Through rigorous experimentation with multiple CNN architectures, we isolated a subset of images systematically misclassified across all models, a phenomenon statistically proven to exceed random chance. To determine if these failures stem from algorithmic biases or inherent visual ambiguity, expert dermatologists independently evaluated these challenging cases alongside a control group. The results revealed a collapse in human diagnostic performance on the AI-misclassified images. First, agreement with ground-truth…
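The idea of a subset "misclassified across all models" beyond random chance can be illustrated with a minimal sketch. The excerpt does not say which statistical test the authors used, so the version below is an assumption: it takes as its null hypothesis that the models err independently, and it uses an exact binomial tail probability. The function names and the toy data are hypothetical and are not taken from the paper or its repository.

```python
from math import comb, prod


def common_error_indices(errors):
    """Indices misclassified by every model.

    errors: one list of booleans per model (True = misclassified),
    all of the same length (one entry per image).
    """
    n = len(errors[0])
    return [i for i in range(n) if all(model[i] for model in errors)]


def binomial_tail(n, k, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def shared_failure_test(errors):
    """Compare the observed overlap of errors with an independence null.

    Assumption (not from the paper): if models erred independently, an
    image would be missed by all of them with probability equal to the
    product of the per-model error rates.
    """
    n = len(errors[0])
    rates = [sum(model) / n for model in errors]
    p_all = prod(rates)
    shared = common_error_indices(errors)
    expected = n * p_all
    p_value = binomial_tail(n, len(shared), p_all)
    return shared, expected, p_value


def error_mask(n, wrong):
    """Boolean mask of length n with True at the given indices."""
    return [i in wrong for i in range(n)]


# Toy data: 3 hypothetical models, 20 images, each model wrong on 4 images.
toy_errors = [
    error_mask(20, {0, 1, 2, 3}),
    error_mask(20, {0, 1, 2, 4}),
    error_mask(20, {0, 1, 2, 5}),
]
shared, expected, p_value = shared_failure_test(toy_errors)
print(shared, round(expected, 2), p_value < 0.01)
```

In this toy case three images are missed by all three models, while the independence null expects well under one, so the overlap is unlikely to be a coincidence. Real error patterns are correlated for many reasons, so a permutation test or a different null model may be more appropriate than this sketch.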
