Large-Scale Dataset and Benchmark for Skin Tone Classification in the Wild
Vitor Pereira Matias, Márcus Vinícius Lobo Costa, João Batista Neto, Tiago Novello de Brito

TL;DR
This paper introduces a large, annotated skin tone dataset and benchmarks various models, demonstrating deep learning's effectiveness and enabling fairness auditing in skin tone classification.
Contribution
The work provides a comprehensive skin tone dataset, benchmarks existing methods, and proposes SkinToneNet for improved fairness assessment and generalization.
Findings
Deep learning models outperform classic methods in skin tone classification.
The dataset enables more accurate fairness analysis across diverse populations.
SkinToneNet achieves state-of-the-art generalization on out-of-domain data.
Abstract
Deep learning models often inherit biases from their training data. While fairness across gender and ethnicity is well-studied, fine-grained skin tone analysis remains a challenge due to the lack of granular, annotated datasets. Existing methods often rely on the medical 6-tone Fitzpatrick scale, which lacks visual representativeness, or on small, private datasets that prevent reproducibility. Many rely on classic computer vision pipelines, with a few using deep learning, and they overlook issues such as train-test leakage and dataset imbalance. In this work, we present a comprehensive framework for skin tone fairness. First, we introduce STW, a large-scale, open-access dataset comprising 42,313 images from 3,564 individuals, labeled using the 10-tone MST scale. Second, we benchmark both Classic Computer Vision (SkinToneCCV) and Deep…
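The abstract names train-test leakage as a problem, and the dataset contains many images per individual. One common way to avoid that kind of leakage is to split by individual rather than by image, so no person appears in both sets. The following is a minimal sketch of that idea, not the authors' protocol: the function name `split_by_individual`, the `(image_id, individual_id, label)` tuple layout, and the toy data are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_by_individual(samples, test_fraction=0.2, seed=0):
    """Split samples so that each individual lands in exactly one set.

    `samples` is a list of (image_id, individual_id, label) tuples.
    This tuple layout is hypothetical, not taken from the paper.
    """
    # Group every image under the person it depicts.
    by_person = defaultdict(list)
    for sample in samples:
        by_person[sample[1]].append(sample)

    # Shuffle individuals (not images) with a fixed seed for reproducibility.
    people = sorted(by_person)
    random.Random(seed).shuffle(people)

    # Hold out a fraction of the individuals, at least one.
    n_test = max(1, round(len(people) * test_fraction))
    test_people = set(people[:n_test])

    train = [s for p in people if p not in test_people for s in by_person[p]]
    test = [s for p in people if p in test_people for s in by_person[p]]
    return train, test


# Toy data: 50 images of 10 made-up individuals, 5 images each.
samples = [(f"img_{i}", f"person_{i % 10}", (i % 10) % 3) for i in range(50)]
train, test = split_by_individual(samples)

train_people = {s[1] for s in train}
test_people = {s[1] for s in test}
print(len(train), len(test), train_people & test_people)
```

A per-image random split would likely place different photos of the same person on both sides, which tends to inflate test accuracy. Splitting on the individual keeps the evaluation closer to the out-of-domain setting the summary emphasizes.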
Taxonomy
Topics: Skin Protection and Aging · Cutaneous Melanoma Detection and Management · Generative Adversarial Networks and Image Synthesis
