TL;DR
This paper addresses the challenge of ensuring fair subset selection in AI applications when protected attributes are noisy, proposing a denoising approach with approximation algorithms that improves fairness without sacrificing utility.
Contribution
It introduces a novel denoising framework for fair subset selection under noisy protected attributes, including a linear-programming approximation algorithm and empirical validation.
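To make the idea of selecting under noisy protected attributes concrete, here is a minimal sketch. It is not the paper's linear-programming algorithm: it swaps in an exhaustive search over a toy instance, and it assumes each candidate comes with an estimated probability of belonging to a protected group. The fairness constraint is then placed on the expected group count rather than on the noisy labels themselves. The function `select_subset`, its parameters, and all the numbers are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def select_subset(utilities, prob_group_a, k, min_expected_a):
    """Pick k items with maximum total utility, subject to a lower bound
    on the EXPECTED number of selected items in group A.

    Assumption (not from the paper): prob_group_a[i] is an estimate of the
    probability that item i truly belongs to group A, given its noisy label.
    Exhaustive search is used here only because the toy instance is tiny.
    """
    best_subset, best_utility = None, float("-inf")
    for subset in combinations(range(len(utilities)), k):
        # Fairness is checked against the expected count, not noisy labels.
        expected_a = sum(prob_group_a[i] for i in subset)
        if expected_a < min_expected_a:
            continue
        total = sum(utilities[i] for i in subset)
        if total > best_utility:
            best_subset, best_utility = subset, total
    return best_subset, best_utility


# Hypothetical toy data: six candidates, select three.
utilities = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
prob_group_a = [0.1, 0.2, 0.1, 0.9, 0.8, 0.7]

chosen, value = select_subset(utilities, prob_group_a, k=3, min_expected_a=1.0)
```

In this toy instance the three highest-utility candidates have a combined expected group-A count of only 0.4, so the constraint forces the search to swap one of them for a candidate that is likely to belong to group A. A practical method would replace the exhaustive search with something scalable, such as the linear-programming approach the paper describes.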
Findings
Significant fairness improvements with noisy attributes
Better utility-fairness tradeoffs than prior methods
Effective on both synthetic and real-world data
Abstract
Subset selection algorithms are ubiquitous in AI-driven applications, including online recruiting portals and image search engines, so it is imperative that these tools are not discriminatory on the basis of protected attributes such as gender or race. Currently, fair subset selection algorithms assume that the protected attributes are known as part of the dataset. However, protected attributes may be noisy due to errors during data collection or if they are imputed (as is often the case in real-world settings). While a wide body of work addresses the effect of noise on the performance of machine learning algorithms, its effect on fairness remains largely unexamined. We find that in the presence of noisy protected attributes, in attempting to increase fairness without considering noise, one can, in fact, decrease the fairness of the result! Towards addressing this, we consider an…
