TL;DR
This paper formalizes proxy discrimination in machine learning systems as the presence of correlates of a protected class that causally influence the system's output, and demonstrates how to detect and repair such violations on social datasets.
Contribution
It introduces a formal definition of proxy discrimination and provides practical validation and repair techniques for biased data-driven systems.
Findings
Validated the approach on social datasets
Demonstrated detection of proxy bias
Showed how to repair bias violations
Abstract
Machine-learnt systems inherit biases against protected classes (historically disparaged groups) from training data. Usually these biases are not explicit: they rely on subtle correlations discovered by training algorithms and are therefore difficult to detect. We formalize proxy discrimination in data-driven systems, a class of properties indicative of bias, as the presence of protected class correlates that have causal influence on the system's output. We evaluate an implementation on a corpus of social datasets, demonstrating how to validate systems against these properties and to repair violations where they occur.
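The definition in the abstract has two parts: a feature must be associated with the protected class, and it must causally influence the output. The sketch below is one simple way to read that definition, not the paper's implementation. It assumes absolute Pearson correlation as the association measure and a permutation-style intervention as the influence measure. The thresholds `eps` and `delta`, the function names, and the toy data are all hypothetical.

```python
import random
from statistics import mean


def association(xs, zs):
    """Absolute Pearson correlation between a feature and the protected attribute."""
    mx, mz = mean(xs), mean(zs)
    cov = sum((x - mx) * (z - mz) for x, z in zip(xs, zs))
    vx = sum((x - mx) ** 2 for x in xs)
    vz = sum((z - mz) ** 2 for z in zs)
    if vx == 0 or vz == 0:
        return 0.0
    return abs(cov) / (vx * vz) ** 0.5


def influence(model, rows, feature, rng):
    """Fraction of outputs that change when `feature` is replaced by a
    value drawn from a shuffled copy of its own column (an intervention
    that breaks its link to every other attribute)."""
    shuffled = [r[feature] for r in rows]
    rng.shuffle(shuffled)
    changed = 0
    for row, new_value in zip(rows, shuffled):
        intervened = dict(row, **{feature: new_value})
        changed += model(row) != model(intervened)
    return changed / len(rows)


def proxy_candidates(model, rows, protected, eps, delta, seed=0):
    """Features that are both associated with `protected` (>= eps)
    and influential on the model output (>= delta)."""
    rng = random.Random(seed)
    zs = [r[protected] for r in rows]
    flagged = []
    for feature in rows[0]:
        if feature == protected:
            continue
        a = association([r[feature] for r in rows], zs)
        i = influence(model, rows, feature, rng)
        if a >= eps and i >= delta:
            flagged.append(feature)
    return flagged


# Toy data (hypothetical): zip_group and club_member mirror the protected
# attribute; test_score is balanced across both protected groups.
rows = [
    {"protected": i % 2, "zip_group": i % 2,
     "club_member": i % 2, "test_score": (i // 2) % 2}
    for i in range(100)
]


def model(row):
    # The model never reads `protected` or `club_member` directly.
    return row["zip_group"] + row["test_score"]


flagged = proxy_candidates(model, rows, "protected", eps=0.5, delta=0.1)
print(flagged)
```

In this toy setup only `zip_group` meets both conditions. `club_member` correlates with the protected attribute but never reaches the output, and `test_score` drives the output but is uncorrelated with the protected attribute, so neither is flagged.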
