# MST-AI: Skin Color Estimation in Skin Cancer Datasets

**Authors:** Vahid Khalkhali, Hayan Lee, Joseph Nguyen, Sergio Zamora-Erazo, Camille Ragin, Abhishek Aphale, Alfonso Bellacosa, Ellis P. Monk, Saroj K. Biswas

PMC · DOI: 10.3390/jimaging11070235 · 2025-07-13

## TL;DR

This paper introduces MST-AI, a method for estimating skin color in skin cancer image datasets so that AI diagnostic models can be made more accurate for diverse populations.

## Contribution

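MST-AI estimates skin color on the Monk Skin Tone (MST) scale by combining CNN-based image preprocessing and lesion segmentation with a Variational Bayesian Gaussian Mixture Model (VB-GMM) of normal skin, addressing skin color bias in skin cancer datasets.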

## Key findings

- Validated against manual annotations, MST-AI outperformed K-means clustering of image and skin mean RGBs, reaching Kendall’s Tau, Spearman’s Rho, and NDCG of 0.68, 0.69, and 1.00, respectively.
- The method successfully modeled normal skin tones using a Variational Bayesian Gaussian Mixture Model.
- MST-AI provides a foundation for unbiased AI models in early skin cancer diagnosis.
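
The three agreement metrics named above can be illustrated with a minimal, standard-library sketch. It is not the authors' evaluation code: the tie-corrected tau-b variant, the average-rank Spearman computation, the linear-gain NDCG, and the `manual` / `predicted` tone labels are all assumptions chosen for illustration.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b: concordant minus discordant pairs, corrected for ties."""
    concordant = discordant = only_x_tied = only_y_tied = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied in both rankings: excluded from both denominator terms
        if dx == 0:
            only_x_tied += 1
        elif dy == 0:
            only_y_tied += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / math.sqrt(
        (untied + only_x_tied) * (untied + only_y_tied)
    )


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def ndcg(true_relevance, predicted_scores):
    """NDCG with linear gain and log2 discount (assumed formulation)."""
    order = sorted(
        range(len(predicted_scores)), key=lambda i: predicted_scores[i], reverse=True
    )
    dcg = sum(true_relevance[i] / math.log2(pos + 2) for pos, i in enumerate(order))
    ideal = sum(
        rel / math.log2(pos + 2)
        for pos, rel in enumerate(sorted(true_relevance, reverse=True))
    )
    return dcg / ideal


# Hypothetical tone labels for five images (not data from the paper).
manual = [1, 2, 3, 4, 5]
predicted = [1, 3, 2, 4, 5]

tau = kendall_tau_b(manual, predicted)
rho = spearman_rho(manual, predicted)
gain = ndcg(manual, predicted)
```

With one swapped pair out of ten, this toy input gives a tau of 0.8 and a rho of 0.9, while NDCG stays just below 1 because the swap happens away from the top of the ranking. That behavior suggests why an NDCG of 1.00 can coexist with more moderate rank correlations.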

## Abstract

The absence of skin color information in skin cancer datasets poses a significant challenge for accurate diagnosis using artificial intelligence models, particularly for non-white populations. In this paper, based on the Monk Skin Tone (MST) scale, which is less biased than the Fitzpatrick scale, we propose MST-AI, a novel method for detecting skin color in images of large datasets, such as the International Skin Imaging Collaboration (ISIC) archive. The approach includes automatic frame and hair removal and lesion segmentation using convolutional neural networks, and modeling normal skin tones with a Variational Bayesian Gaussian Mixture Model (VB-GMM). The distribution of skin color predictions was compared with MST scale probability distribution functions (PDFs) using the Kullback-Leibler Divergence (KLD) metric. Validation against manual annotations and comparison with K-means clustering of image and skin mean RGBs demonstrated the superior performance of MST-AI, with Kendall’s Tau, Spearman’s Rho, and Normalized Discounted Cumulative Gain (NDCG) of 0.68, 0.69, and 1.00, respectively. This research lays the groundwork for developing unbiased AI models for early skin cancer diagnosis by addressing skin color imbalances in large datasets.
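
The KLD comparison described in the abstract can be sketched as follows. This is a simplified illustration rather than the paper's implementation: it assumes discrete histograms over a shared set of bins, computes the divergence from the observed distribution to each reference, and uses three made-up reference PDFs (`tone_light`, `tone_medium`, `tone_dark`) in place of the ten MST tones.

```python
import math


def normalize(hist):
    """Scale non-negative bin counts so they sum to 1."""
    total = sum(hist)
    return [h / total for h in hist]


def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) for discrete distributions over the same bins.

    eps guards against division by zero when a reference bin is empty.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0
    )


def closest_tone(observed_hist, reference_pdfs):
    """Return the reference label with the smallest divergence, plus all scores."""
    p = normalize(observed_hist)
    scores = {
        label: kl_divergence(p, normalize(q)) for label, q in reference_pdfs.items()
    }
    return min(scores, key=scores.get), scores


# Hypothetical 4-bin lightness PDFs (dark to light); not values from the paper.
reference_pdfs = {
    "tone_light": [0.05, 0.10, 0.25, 0.60],
    "tone_medium": [0.10, 0.40, 0.40, 0.10],
    "tone_dark": [0.60, 0.25, 0.10, 0.05],
}

# Hypothetical pixel counts from the normal-skin region of one image.
observed_hist = [12, 38, 41, 9]

best, scores = closest_tone(observed_hist, reference_pdfs)
```

Here `best` is `"tone_medium"`, since the observed histogram is nearly identical to that reference. KLD is asymmetric, so the direction of the comparison is a design choice; the direction used here is an assumption.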

## Linked entities

- **Diseases:** skin cancer (MONDO:0002898)

## Full-text entities

- **Diseases:** Skin Cancer (MESH:D012878), lesion (MESH:D009059)

## Figures

31 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12295582/full.md

---
Source: https://tomesphere.com/paper/PMC12295582