Disparities in Dermatology AI Performance on a Diverse, Curated Clinical Image Set
Roxana Daneshjou, Kailas Vodrahalli, Roberto A Novoa, Melissa Jenkins, Weixin Liang, Veronica Rotemberg, Justin Ko, Susan M Swetter, Elizabeth E Bailey, Olivier Gevaert, Pritam Mukherjee, Michelle Phung, Kiana Yekrang, Bradley Fong, Rachna Sahasrabudhe, Johan A. C. Allerup

TL;DR
This study finds that state-of-the-art dermatology AI models perform substantially worse on dark skin tones and uncommon diseases, and shows that fine-tuning on diverse data can improve accuracy and fairness.
Contribution
The paper introduces the DDI dataset, a curated, diverse dermatology image set, and evaluates AI performance disparities, highlighting the need for bias mitigation in dermatology AI.
Findings
On the DDI dataset, the ROC-AUC of state-of-the-art models drops by 27-36 percent compared to their original test results
Models and dermatologists perform worse on dark skin and rare diseases
Fine-tuning on diverse data improves AI fairness and accuracy
Abstract
Access to dermatological care is a major issue, with an estimated 3 billion people lacking access to care globally. Artificial intelligence (AI) may aid in triaging skin diseases. However, most AI models have not been rigorously assessed on images of diverse skin tones or uncommon diseases. To ascertain potential biases in algorithm performance in this context, we curated the Diverse Dermatology Images (DDI) dataset, the first publicly available, expertly curated, and pathologically confirmed image dataset with diverse skin tones. Using this dataset of 656 images, we show that state-of-the-art dermatology AI models perform substantially worse on DDI, with the area under the receiver operating characteristic curve (ROC-AUC) dropping by 27-36 percent compared to the models' original test results. All the models performed worse on dark skin tones and uncommon diseases, which are represented in the DDI…
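The kind of subgroup comparison the abstract describes can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the group labels "light" and "dark", and the toy labels and scores are all hypothetical, chosen only to show how ROC-AUC can be computed separately for each subgroup and then compared.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive example
    scores higher than a random negative one (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_group(labels, scores, groups):
    """Compute ROC-AUC separately for each subgroup label."""
    buckets = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: roc_auc(ys, ss) for g, (ys, ss) in buckets.items()}


# Hypothetical toy data: 1 = malignant, 0 = benign; scores are model outputs.
toy_labels = [1, 1, 0, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.3, 0.2, 0.6, 0.3, 0.5, 0.2]
toy_groups = ["light"] * 4 + ["dark"] * 4

per_group = auc_by_group(toy_labels, toy_scores, toy_groups)
gap = per_group["light"] - per_group["dark"]
print(per_group, gap)
```

The pairwise formulation is quadratic in the number of examples, which is acceptable for a dataset of a few hundred images; a library routine would be the usual choice for larger evaluations.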
