Training individually fair ML models with Sensitive Subspace Robustness
Mikhail Yurochkin, Amanda Bower, Yuekai Sun

TL;DR
This paper introduces a distributionally robust optimization method for training machine learning models whose performance is invariant under sensitive perturbations of the inputs, such as changes to an applicant's gender or ethnicity.
Contribution
It formalizes sensitive subspace robustness, a variant of individual fairness, and proposes a distributionally robust optimization approach to enforce it during training.
Findings
Effective in reducing gender and racial biases in ML tasks
Improves fairness without sacrificing overall performance
Demonstrates robustness against sensitive attribute perturbations
Abstract
We consider training machine learning models that are fair in the sense that their performance is invariant under certain sensitive perturbations to the inputs. For example, the performance of a resume screening system should be invariant under changes to the gender and/or ethnicity of the applicant. We formalize this notion of algorithmic fairness as a variant of individual fairness and develop a distributionally robust optimization approach to enforce it during training. We also demonstrate the effectiveness of the approach on two ML tasks that are susceptible to gender and racial biases.
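To make the idea of invariance under sensitive perturbations concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a linear logistic model, a sensitive subspace spanned by the columns of a known matrix `A`, and a per-example Euclidean perturbation budget `radius`, which is a simplification standing in for the distributionally robust formulation described in the abstract. All function names, the synthetic data, and the hyperparameters are hypothetical.

```python
import numpy as np


def projector(A):
    """Orthogonal projector onto the column span of A (shape d x k)."""
    Q, _ = np.linalg.qr(A)
    return Q @ Q.T


def worst_case_inputs(X, y, w, P, radius):
    """Shift each row of X inside the sensitive subspace to maximise logistic loss.

    For a linear model the maximiser has a closed form: move each point
    against its label along the projection of w onto the subspace.
    """
    pw = P @ w
    norm = np.linalg.norm(pw)
    if norm < 1e-12:
        return X.copy()  # the model already ignores the sensitive subspace
    signs = np.where(y == 1, -1.0, 1.0)  # push the logit toward the wrong class
    return X + radius * signs[:, None] * (pw / norm)[None, :]


def train(X, y, P, radius, lr=0.1, steps=2000):
    """Gradient descent on the logistic loss evaluated at worst-case inputs."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        X_adv = worst_case_inputs(X, y, w, P, radius)
        p = 1.0 / (1.0 + np.exp(-(X_adv @ w + b)))
        grad = p - y
        w -= lr * (X_adv.T @ grad) / n
        b -= lr * grad.mean()
    return w, b


def sensitivity(w, P):
    """Largest change in the logit per unit move inside the sensitive subspace."""
    return float(np.linalg.norm(P @ w))


# Synthetic data (assumption): feature 0 is a sensitive proxy that happens to
# correlate with the label, feature 1 is a legitimate predictor.
rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, size=n).astype(float)
s = 2.0 * y - 1.0
X = np.column_stack([0.8 * s + rng.normal(size=n), s + rng.normal(size=n)])

A = np.array([[1.0], [0.0]])  # sensitive subspace: the first coordinate axis
P = projector(A)

w_plain, _ = train(X, y, P, radius=0.0)   # ordinary logistic regression
w_robust, _ = train(X, y, P, radius=2.0)  # trained against sensitive shifts

sens_plain = sensitivity(w_plain, P)
sens_robust = sensitivity(w_robust, P)
print("sensitivity without robust training:", sens_plain)
print("sensitivity with robust training:   ", sens_robust)
```

In this toy setting, training against worst-case shifts along the sensitive direction acts like a penalty on the model's weight in that direction, so the robustly trained model should react less to changes in the sensitive proxy than the plainly trained one.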
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Privacy-Preserving Technologies in Data · Ethics and Social Impacts of AI
