Mitigating Bias in Set Selection with Noisy Protected Attributes

Anay Mehrotra; L. Elisa Celis

arXiv:2011.04219·cs.CY·February 23, 2021

TL;DR

This paper studies fair subset selection when protected attributes are noisy. It proposes a denoising approach with approximation algorithms that improves fairness and achieves better utility-fairness tradeoffs than prior methods.

Contribution

The paper introduces a denoising framework for fair subset selection under noisy protected attributes, together with a linear-programming-based approximation algorithm and an empirical validation.
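To make the idea of selecting under noisy attributes concrete, here is a minimal sketch. It is not the authors' algorithm: it assumes (hypothetically) that each candidate comes with a probability of belonging to a protected group, and it replaces the linear-programming approximation with a brute-force search that is only practical for tiny inputs. The names `select_with_expected_bounds`, `utilities`, `group_probs`, `lower`, and `upper` are illustrative and do not come from the paper.

```python
from itertools import combinations


def select_with_expected_bounds(utilities, group_probs, k, lower, upper):
    """Return the highest-utility k-subset whose EXPECTED number of
    protected-group members lies in [lower, upper].

    Assumption (not from the source): group_probs[i] is the probability
    that candidate i belongs to the protected group.
    Brute force over all k-subsets; a stand-in for an LP-based method.
    """
    best_subset, best_utility = None, float("-inf")
    for subset in combinations(range(len(utilities)), k):
        # Expected group count under the noise model is a sum of probabilities.
        expected_count = sum(group_probs[i] for i in subset)
        if not (lower <= expected_count <= upper):
            continue
        total = sum(utilities[i] for i in subset)
        if total > best_utility:
            best_subset, best_utility = subset, total
    return best_subset, best_utility


# Toy data: the three highest-utility candidates are unlikely group members.
utilities = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
group_probs = [0.1, 0.2, 0.1, 0.9, 0.8, 0.7]

chosen, value = select_with_expected_bounds(
    utilities, group_probs, k=3, lower=1.0, upper=3.0
)
print(chosen, round(value, 2))
```

In this toy instance the unconstrained top three candidates have an expected group count of only 0.4, so the search swaps in the most useful candidate who is likely to belong to the group. Working with probabilities rather than hard labels is the design choice being illustrated; the bounds and noise model here are assumptions chosen for the example.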

Findings

01. Significant fairness improvements with noisy attributes

02. Better utility-fairness tradeoffs than prior methods

03. Effective on both synthetic and real-world data

Abstract

Subset selection algorithms are ubiquitous in AI-driven applications, including online recruiting portals and image search engines, so it is imperative that these tools are not discriminatory on the basis of protected attributes such as gender or race. Currently, fair subset selection algorithms assume that the protected attributes are known as part of the dataset. However, protected attributes may be noisy due to errors during data collection or because they are imputed (as is often the case in real-world settings). While a wide body of work addresses the effect of noise on the performance of machine learning algorithms, its effect on fairness remains largely unexamined. We find that in the presence of noisy protected attributes, attempting to increase fairness without considering noise can, in fact, decrease the fairness of the result! Towards addressing this, we consider an…
