Using Self-supervised Learning Can Improve Model Fairness

Sofia Yfantidou; Dimitris Spathis; Marios Constantinides; Athena Vakali; Daniele Quercia; Fahim Kawsar

arXiv:2406.02361·cs.LG·June 5, 2024

TL;DR

This paper investigates how self-supervised learning (SSL) can enhance model fairness across demographic groups, showing that SSL can significantly improve fairness with minimal performance loss compared to supervised methods.

Contribution

The study introduces a comprehensive fairness assessment framework for SSL and demonstrates that SSL models can achieve up to 30% greater fairness across real-world datasets.

Findings

01 SSL improves fairness by up to 30% with minimal performance loss

02 Representation dissimilarities correlate with demographic performance gaps

03 SSL models maintain performance comparable to supervised models

Abstract

Self-supervised learning (SSL) has become the de facto training paradigm of large models, where pre-training is followed by supervised fine-tuning using domain-specific data and labels. Despite demonstrating comparable performance with supervised methods, comprehensive efforts to assess SSL's impact on machine learning fairness (i.e., performing equally on different demographic breakdowns) are lacking. Hypothesizing that SSL models would learn more generic, hence less biased representations, this study explores the impact of pre-training and fine-tuning strategies on fairness. We introduce a fairness assessment framework for SSL, comprising five stages: defining dataset requirements, pre-training, fine-tuning with gradual unfreezing, assessing representation similarity conditioned on demographics, and establishing domain-specific evaluation processes. We evaluate our method's…
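The abstract describes fairness as performing equally across demographic breakdowns. As a rough illustration of that idea only, the sketch below computes accuracy per demographic group and the largest gap between groups. The function names, the toy labels, and the choice of accuracy gap as the measure are all assumptions made for this example; they are not taken from the paper or its repository, which may use different metrics.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for each demographic group (hypothetical helper)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def max_accuracy_gap(y_true, y_pred, groups):
    """Largest difference in accuracy between any two groups (0.0 means equal)."""
    per_group = accuracy_by_group(y_true, y_pred, groups)
    return max(per_group.values()) - min(per_group.values())


# Toy data, invented for illustration: two groups, "a" and "b".
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

per_group = accuracy_by_group(y_true, y_pred, groups)
gap = max_accuracy_gap(y_true, y_pred, groups)
print(per_group)  # accuracy for each group
print(gap)        # smaller values mean more equal performance
```

Under this assumed measure, comparing the gap for a supervised model with the gap for an SSL model on the same test set gives one simple way to ask which of the two performs more evenly across groups.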

Code & Models

Repositories

Nokia-Bell-Labs/SSLfairness · tf · Official

Taxonomy

Topics: Robotic Process Automation Applications · Machine Learning and Data Classification