TrustSkin: A Fairness Pipeline for Trustworthy Facial Affect Analysis Across Skin Tone
Ana M. Cabanas, Alma Pedro, Domingo Mery

TL;DR
This paper evaluates skin tone measurement methods for facial affect analysis fairness, revealing that perceptual measures like $H^*$-$L^*$ improve subgroup fairness diagnostics and highlight disparities affecting darker skin tones.
Contribution
It introduces a modular fairness pipeline using perceptual skin tone estimation, improving fairness assessment and diagnosis in facial affect analysis systems.
Findings
Dark skin tones are severely underrepresented (~2%) in datasets.
ITA method is sensitive to lighting, affecting fairness evaluation.
Perceptual $H^*$-$L^*$ method provides more consistent subgrouping and clearer fairness diagnostics.
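For reference, ITA is conventionally computed from CIELAB coordinates as $\mathrm{ITA} = \arctan\big((L^* - 50)/b^*\big) \cdot 180/\pi$. The sketch below is illustrative only: the function names are hypothetical, and the six-band thresholds are the commonly cited cut-offs, assumed here because this summary does not state which bins the paper uses. Since $L^*$ enters the formula directly, a change in illumination alone can move a face into a different band, which is consistent with the lighting sensitivity noted above.

```python
import math

def ita_degrees(l_star: float, b_star: float) -> float:
    """Individual Typology Angle in degrees from CIELAB L* and b*.

    atan2 is used to avoid division by zero; for b* > 0 (typical for skin)
    it equals arctan((L* - 50) / b*).
    """
    return math.degrees(math.atan2(l_star - 50.0, b_star))

def ita_category(ita: float) -> str:
    """Map an ITA value to a skin tone band.

    Thresholds are the commonly cited six-band cut-offs (an assumption,
    not necessarily the grouping used in the paper).
    """
    if ita > 55:
        return "very light"
    if ita > 41:
        return "light"
    if ita > 28:
        return "intermediate"
    if ita > 10:
        return "tan"
    if ita > -30:
        return "brown"
    return "dark"

# Same b*, different lightness: the band changes with L* alone.
bright = ita_category(ita_degrees(60.0, 10.0))
dim = ita_category(ita_degrees(50.0, 10.0))
```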
Abstract
Understanding how facial affect analysis (FAA) systems perform across different demographic groups requires reliable measurement of sensitive attributes such as ancestry, often approximated by skin tone, which itself is highly influenced by lighting conditions. This study compares two objective skin tone classification methods: the widely used Individual Typology Angle (ITA) and a perceptually grounded alternative based on Lightness ($L^*$) and Hue ($H^*$). Using AffectNet and a MobileNet-based model, we assess fairness across skin tone groups defined by each method. Results reveal a severe underrepresentation of dark skin tones (~2%), alongside fairness disparities in F1-score (up to 0.08) and TPR (up to 0.11) across groups. While ITA shows limitations due to its sensitivity to lighting, the $H^*$-$L^*$ method yields more consistent subgrouping and enables clearer diagnostics…
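The per-group disparities quoted in the abstract can be illustrated with a small sketch. It assumes that a disparity is the difference between the best and worst group value of a metric (here the true positive rate for one emotion class); the summary does not spell out the paper's exact definition, and the helper names and toy labels below are hypothetical, not data from the study.

```python
from collections import defaultdict

def tpr_by_group(y_true, y_pred, groups, positive):
    """True positive rate for the `positive` class, computed per group."""
    hits = defaultdict(int)    # correctly predicted positives per group
    totals = defaultdict(int)  # actual positives per group
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == positive:
            totals[group] += 1
            if pred == positive:
                hits[group] += 1
    return {g: hits[g] / totals[g] for g in totals}

def max_gap(metric_by_group):
    """Largest difference in a metric between any two groups."""
    values = list(metric_by_group.values())
    return max(values) - min(values)

# Toy example (made-up labels, not results from the paper).
y_true = ["happy"] * 8
y_pred = ["happy", "happy", "happy", "sad", "happy", "happy", "sad", "sad"]
groups = ["light"] * 4 + ["dark"] * 4

rates = tpr_by_group(y_true, y_pred, groups, positive="happy")
gap = max_gap(rates)
```

With very small groups, such as the roughly 2% share of dark skin tones reported here, each per-group rate rests on few samples, so a gap of this kind is best read alongside the group sizes.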
Taxonomy
Topics: Emotion and Mood Recognition · Color perception and design
