FairJob: A Real-World Dataset for Fairness in Online Systems

Mariia Vladimirova; Federico Pavone; Eustache Diemert

arXiv:2407.03059·cs.LG·November 5, 2024

TL;DR

FairJob introduces a real-world, privacy-compliant dataset for studying fairness in online job recommendation systems, enabling research on bias mitigation with practical implications.

Contribution

The paper presents a novel, anonymized dataset with proxy attributes for fairness research in advertising, along with methods to evaluate and improve fairness in biased datasets.

Findings

01. Bias mitigation techniques show potential to improve fairness.

02. Trade-offs exist between fairness and utility in recommendations.

03. The dataset provides a realistic benchmark for fairness research.
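To make the fairness/utility trade-off in the findings concrete, here is a minimal sketch of how the two quantities might be measured side by side. It assumes a binary proxy attribute and uses the demographic parity gap as the fairness measure and plain accuracy as the utility measure; these are common choices, not necessarily the metrics the paper reports. The function names and the toy data are hypothetical and do not come from the FairJob dataset.

```python
from typing import Sequence


def demographic_parity_gap(
    scores: Sequence[float], group: Sequence[int], threshold: float = 0.5
) -> float:
    """Absolute difference in positive-prediction rate between groups 0 and 1."""
    rates = []
    for g in (0, 1):
        members = [s for s, a in zip(scores, group) if a == g]
        if not members:
            raise ValueError(f"no examples for group {g}")
        rates.append(sum(s >= threshold for s in members) / len(members))
    return abs(rates[0] - rates[1])


def accuracy(
    scores: Sequence[float], labels: Sequence[int], threshold: float = 0.5
) -> float:
    """Fraction of examples whose thresholded score matches the label."""
    correct = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return correct / len(labels)


# Toy, made-up values for illustration only (not drawn from FairJob).
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]  # model scores
labels = [1, 1, 0, 1, 0, 0]              # observed outcomes (e.g. clicks)
proxy = [0, 0, 0, 1, 1, 1]               # binary proxy for the sensitive attribute

gap = demographic_parity_gap(scores, proxy)
acc = accuracy(scores, labels)
print(f"demographic parity gap: {gap:.3f}, accuracy: {acc:.3f}")
```

A mitigation method would typically be judged by how far it reduces the gap and how much utility it gives up in exchange.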

Abstract

We introduce a fairness-aware dataset for job recommendations in advertising, designed to foster research in algorithmic fairness within real-world scenarios. It was collected and prepared to comply with privacy standards and business confidentiality. An additional challenge is the lack of access to protected user attributes such as gender, for which we propose a solution to obtain a proxy estimate. Despite being anonymized and including a proxy for a sensitive attribute, our dataset preserves predictive power and maintains a realistic and challenging benchmark. This dataset addresses a significant gap in the availability of fairness-focused resources for high-impact domains like advertising -- the actual impact being having access or not to precious employment opportunities, where balancing fairness and utility is a common industrial challenge. We also explore various stages in the…

Code & Models

Repositories

criteo-research/fairjob-dataset (PyTorch, official)

Datasets

Videos

FairJob: A Real-World Dataset for Fairness in Online Systems · SlidesLive

Taxonomy

Topics: Ethics and Social Impacts of AI