Fair Machine Learning under Limited Demographically Labeled Data

Mustafa Safa Ozdayi; Murat Kantarcioglu; Rishabh Iyer

arXiv:2106.04757 · cs.LG · April 12, 2022

TL;DR

This paper develops fair machine learning algorithms for settings where demographic labels are scarce, for example because privacy concerns limit their collection. The algorithms remain effective even when only 0.1% demographic labels are available.

Contribution

The paper introduces algorithms that balance utility and fairness when demographic labels are scarce. Even with very limited demographic information, they outperform Rawlsian methods, which are fair training methods that use no demographic features at all.

Findings

01. Algorithms outperform Rawlsian methods with 0.1% demographic data

02. Main algorithm is adaptable to multiple objectives

03. Extended to be robust against label noise

Abstract

Research has shown that machine learning models might inherit and propagate undesired social biases encoded in the data. To address this problem, fair training algorithms have been developed. However, most of these algorithms assume that demographic (sensitive) features such as gender and race are known. This assumption falls short in scenarios where collecting demographic information is not feasible due to privacy concerns and data protection policies. A recent line of work develops fair training methods that can function without any demographic features in the data; these are collectively referred to as Rawlsian methods. Yet we show in experiments that Rawlsian methods tend to exhibit relatively high bias. Given this, we look at the middle ground between the previous approaches and consider a setting where we know the demographic attributes for only a small subset of our data. In such a setting, we…

Code & Models

Repositories

TinfoilHat0/BiFair (PyTorch, official)

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Ethics and Social Impacts of AI