TL;DR
This paper introduces a new fairness notion called disparate mistreatment, focusing on equalizing misclassification rates across social groups, and proposes methods to incorporate this fairness into classifiers with minimal accuracy loss.
Contribution
It defines disparate mistreatment as a notion of unfairness based on differences in misclassification rates between social groups, and develops convex-concave constraints that enforce fairness under this notion in decision-boundary-based classifiers.
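To illustrate how a constraint on a decision-boundary classifier can target misclassifications, here is a minimal sketch. It assumes a linear boundary with signed distance `theta · x`, labels in {-1, +1}, and a binary sensitive attribute `z`. The covariance form, the function names, and the toy data are all assumptions made for illustration and are not taken from this page. The idea is that `min(0, y * d)` is nonzero only for misclassified points, so its covariance with `z` indicates whether one group's errors dominate.

```python
def signed_distance(theta, x):
    """Signed distance proxy for a linear boundary: theta . x (hypothetical helper)."""
    return sum(t * xi for t, xi in zip(theta, x))


def misclassification_covariance(theta, X, y, z):
    """Empirical covariance between the sensitive attribute z and
    g = min(0, y * d_theta(x)), which is zero for correctly classified points.

    Assumption: labels y are in {-1, +1}; this is an illustrative proxy only.
    """
    n = len(X)
    z_mean = sum(z) / n
    g = [min(0.0, yi * signed_distance(theta, xi)) for xi, yi in zip(X, y)]
    return sum((zi - z_mean) * gi for zi, gi in zip(z, g)) / n


# Toy data (invented): one misclassified point per group.
theta = [1.0, 0.0]
y = [1, 1, -1, -1]
z = [0, 0, 1, 1]

# Both groups are misclassified by the same margin, so the covariance is zero.
X_balanced = [[2.0, 0.0], [-1.0, 0.0], [1.0, 0.0], [-3.0, 0.0]]
cov_balanced = misclassification_covariance(theta, X_balanced, y, z)

# Group z=0 is misclassified by a larger margin, so the covariance is nonzero.
X_skewed = [[2.0, 0.0], [-2.0, 0.0], [1.0, 0.0], [-3.0, 0.0]]
cov_skewed = misclassification_covariance(theta, X_skewed, y, z)
```

In a training setup, a fairness constraint of this kind would bound the absolute value of such a covariance by a small threshold while the usual loss is minimized; the threshold would control the trade-off between accuracy and fairness.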
Findings
The proposed constraints reduce disparate mistreatment, that is, gaps in misclassification rates between groups
The reduction comes at a small cost in classification accuracy
The method is evaluated on both synthetic and real-world datasets
Abstract
Automated data-driven decision making systems are increasingly being used to assist, or even replace humans in many settings. These systems function by learning from historical decisions, often taken by humans. In order to maximize the utility of these systems (or, classifiers), their training involves minimizing the errors (or, misclassifications) over the given historical data. However, it is quite possible that the optimally trained classifier makes decisions for people belonging to different social groups with different misclassification rates (e.g., misclassification rates for females are higher than for males), thereby placing these groups at an unfair disadvantage. To account for and avoid such unfairness, in this paper, we introduce a new notion of unfairness, disparate mistreatment, which is defined in terms of misclassification rates. We then propose intuitive measures of…
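To make the definition concrete, the following minimal sketch measures per-group misclassification rates and the gaps between two groups. It assumes binary labels in {0, 1}; the function names and the toy data are hypothetical and chosen only for illustration.

```python
def group_error_rates(y_true, y_pred, group):
    """Return per-group false positive rate, false negative rate, and overall error.

    Assumption: labels and predictions are binary, encoded as 0 or 1.
    """
    stats = {}
    for g in sorted(set(group)):
        idx = [i for i, gi in enumerate(group) if gi == g]
        fp = sum(1 for i in idx if y_true[i] == 0 and y_pred[i] == 1)
        fn = sum(1 for i in idx if y_true[i] == 1 and y_pred[i] == 0)
        neg = sum(1 for i in idx if y_true[i] == 0)
        pos = sum(1 for i in idx if y_true[i] == 1)
        stats[g] = {
            "fpr": fp / neg if neg else float("nan"),
            "fnr": fn / pos if pos else float("nan"),
            "error": (fp + fn) / len(idx),
        }
    return stats


def mistreatment_gaps(y_true, y_pred, group, a, b):
    """Difference in each misclassification rate between group a and group b."""
    s = group_error_rates(y_true, y_pred, group)
    return {k: s[a][k] - s[b][k] for k in ("fpr", "fnr", "error")}


# Toy data (invented): both groups have the same overall error rate,
# but the errors are of different kinds.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
group = ["f", "f", "f", "f", "m", "m", "m", "m"]

gaps = mistreatment_gaps(y_true, y_pred, group, "f", "m")
```

In this toy case the overall error gap is zero while the false positive and false negative gaps are not, which is why it can be useful to inspect the individual rates rather than only the overall misclassification rate.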
