Bias in Machine Learning Software: Why? How? What to do?

Joymallya Chakraborty; Suvodeep Majumder; Tim Menzies

arXiv:2105.12195·cs.LG·July 12, 2021

TL;DR

This paper investigates the root causes of bias in machine learning models used for critical decision-making and proposes a new algorithm, Fair-SMOTE, that reduces bias while improving recall and F1.

Contribution

It introduces the Fair-SMOTE algorithm, which targets the postulated root causes of bias by removing biased labels and rebalancing the data, and it demonstrates bias reduction comparable to existing methods along with higher recall and F1.

Findings

01

Fair-SMOTE reduces bias as effectively as prior methods.

02

Fair-SMOTE improves recall and F1 scores over state-of-the-art algorithms.

03

The study is among the largest on bias mitigation, covering multiple datasets and learners.

Abstract

Increasingly, software is making autonomous decisions in cases such as criminal sentencing, approving credit cards, hiring employees, and so on. Some of these decisions show bias and adversely affect certain social groups (e.g. those defined by sex, race, age, marital status). Many prior works on bias mitigation take the following form: change the data or learners in multiple ways, then see if any of that improves fairness. Perhaps a better approach is to postulate root causes of bias and then apply some resolution strategy. This paper postulates that the root causes of bias are the prior decisions that affect (a) what data was selected and (b) the labels assigned to those examples. Our Fair-SMOTE algorithm removes biased labels and rebalances internal distributions such that, based on the sensitive attribute, examples are equal in both positive and negative classes. On testing, it was seen…
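
The rebalancing idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it splits the data into subgroups by class label and sensitive attribute, then grows every subgroup to the size of the largest one. As a simplifying assumption, it duplicates existing rows at random instead of generating synthetic examples, and it omits the biased-label removal step entirely. The function names `rebalance_subgroups` and `subgroup_counts`, the column names `label` and `sex`, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def rebalance_subgroups(rows, label_key, protected_key, seed=0):
    """Oversample so every (label, protected attribute) subgroup has equal size.

    Assumption: new rows are random duplicates of existing rows in the same
    subgroup. This stands in for synthetic sample generation and is NOT the
    method used by Fair-SMOTE itself.
    """
    rng = random.Random(seed)

    # Bucket rows by the pair (class label, sensitive attribute value).
    groups = defaultdict(list)
    for row in rows:
        groups[(row[label_key], row[protected_key])].append(row)

    # Every subgroup is grown to the size of the largest one.
    target = max(len(members) for members in groups.values())

    balanced = []
    for members in groups.values():
        balanced.extend(members)
        extra = target - len(members)
        # Copy each sampled row so later edits do not alias the original.
        balanced.extend(dict(rng.choice(members)) for _ in range(extra))
    return balanced


def subgroup_counts(rows, label_key, protected_key):
    """Count rows per (label, protected attribute) subgroup."""
    counts = defaultdict(int)
    for row in rows:
        counts[(row[label_key], row[protected_key])] += 1
    return dict(counts)


# Hypothetical toy data: sex is the sensitive attribute, label is the outcome.
rows = (
    [{"sex": 1, "label": 1} for _ in range(6)]
    + [{"sex": 1, "label": 0} for _ in range(3)]
    + [{"sex": 0, "label": 1} for _ in range(2)]
    + [{"sex": 0, "label": 0} for _ in range(5)]
)

balanced = rebalance_subgroups(rows, label_key="label", protected_key="sex")
counts = subgroup_counts(balanced, label_key="label", protected_key="sex")
print(counts)  # every subgroup now holds 6 rows
```

After rebalancing, each value of the sensitive attribute has the same number of positive and negative examples, which is the property the abstract describes. Plain duplication is the simplest way to reach that state; generating new synthetic points would add variety to the minority subgroups at the cost of a more involved procedure.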

