A survey on datasets for fairness-aware machine learning

Tai Le Quy; Arjun Roy; Vasileios Iosifidis; Wenbin Zhang; Eirini Ntoutsi

arXiv:2110.00530·cs.LG·March 8, 2022

TL;DR

This survey reviews real-world tabular datasets used in fairness-aware machine learning, analyzing their attributes and biases to support empirical evaluation of fairness interventions.

Contribution

It provides a comprehensive overview of datasets, explores attribute relationships, and investigates biases to aid fair ML research and benchmarking.

Findings

01 Identifies relationships between dataset attributes using Bayesian networks

02 Analyzes bias and attribute interactions through exploratory analysis

03 Highlights the importance of diverse datasets for fairness evaluation

Abstract

As decision-making increasingly relies on Machine Learning (ML) and (big) data, the issue of fairness in data-driven Artificial Intelligence (AI) systems is receiving increasing attention from both research and industry. A large variety of fairness-aware machine learning solutions have been proposed which involve fairness-related interventions in the data, learning algorithms and/or model outputs. However, a vital part of proposing new approaches is evaluating them empirically on benchmark datasets that represent realistic and diverse settings. Therefore, in this paper, we overview real-world datasets used for fairness-aware machine learning. We focus on tabular data as the most common data representation for fairness-aware machine learning. We start our analysis by identifying relationships between the different attributes, particularly w.r.t. protected attributes and class attribute,…
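
To make the idea of relating a protected attribute to the class attribute concrete, the following is a minimal sketch of one simple exploratory check: the share of positive class labels per protected group. It is not the survey's own analysis code. The function name `positive_rate_by_group`, the column names `sex` and `income`, and the toy records are all assumptions made up for illustration.

```python
from collections import defaultdict


def positive_rate_by_group(rows, protected, label, positive):
    """Return {group value: share of rows whose label equals `positive`}.

    Hypothetical helper for illustration; `rows` is a list of dicts.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for row in rows:
        group = row[protected]
        counts[group][1] += 1
        if row[label] == positive:
            counts[group][0] += 1
    return {g: pos / total for g, (pos, total) in counts.items()}


# Made-up toy records, not taken from any surveyed dataset.
toy_rows = [
    {"sex": "F", "income": ">50K"},
    {"sex": "F", "income": "<=50K"},
    {"sex": "F", "income": "<=50K"},
    {"sex": "F", "income": "<=50K"},
    {"sex": "M", "income": ">50K"},
    {"sex": "M", "income": ">50K"},
    {"sex": "M", "income": "<=50K"},
    {"sex": "M", "income": "<=50K"},
]

rates = positive_rate_by_group(toy_rows, "sex", "income", ">50K")
# Difference in positive rates between the two groups in the toy data.
gap = rates["M"] - rates["F"]
print(rates, gap)
```

A nonzero gap on real data would only indicate that the class label is unevenly distributed across groups; it says nothing by itself about why, which is where a fuller attribute-level analysis would come in.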

Code & Models

Repositories

tailequy/fairness_dataset
Official
