# Generative Adversarial Networks for Mitigating Biases in Machine Learning Systems

**Authors:** Adel Abusitta, Esma Aïmeur, Omar Abdel Wahab

arXiv: 1905.09972 · 2019-05-27

## TL;DR

This paper proposes a framework based on conditional Generative Adversarial Networks (cGANs) that generates synthetic fair data from the original training data. The authors report that it mitigates several types of bias while also improving prediction accuracy, in contrast to model-oriented methods that only tune the training algorithm.

## Contribution

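The work presents a data-oriented approach to bias mitigation: a cGAN generates synthetic fair data with selected properties, and a companion bias-analysis framework determines how much data, and of what type, should be synthesized and labeled for each population group. This targets two limitations the authors attribute to model-oriented methods: loss of accuracy and longer training on heavily biased data.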

## Key findings

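- Experiments show the approach mitigates different types of bias
- Prediction accuracy of the underlying model improves when it is trained with the synthetic data
- The approach is designed to avoid the longer training that model-oriented methods incur on heavily biased data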

## Abstract

In this paper, we propose a new framework for mitigating biases in machine learning systems. The problem of the existing mitigation approaches is that they are model-oriented in the sense that they focus on tuning the training algorithms to produce fair results, while overlooking the fact that the training data can itself be the main reason for biased outcomes. Technically speaking, two essential limitations can be found in such model-based approaches: 1) the mitigation cannot be achieved without degrading the accuracy of the machine learning models, and 2) when the data used for training are largely biased, the training time automatically increases so as to find suitable learning parameters that help produce fair results. To address these shortcomings, we propose in this work a new framework that can largely mitigate the biases and discriminations in machine learning systems while at the same time enhancing the prediction accuracy of these systems. The proposed framework is based on conditional Generative Adversarial Networks (cGANs), which are used to generate new synthetic fair data with selective properties from the original data. We also propose a framework for analyzing data biases, which is important for understanding the amount and type of data that need to be synthetically sampled and labeled for each population group. Experimental results show that the proposed solution can efficiently mitigate different types of biases, while at the same time enhancing the prediction accuracy of the underlying machine learning model.
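The abstract says the bias analysis determines how much data must be synthesized for each population group. This summary does not describe the paper's actual procedure, so the following is only a minimal sketch of one plausible reading: count the rows in every (group, label) cell and compute how many synthetic rows each cell would need to match the largest one. The function name `synthetic_sample_plan`, the equal-cell-size target, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def synthetic_sample_plan(groups, labels):
    """Hypothetical helper: number of synthetic rows per (group, label) cell.

    Assumption: "fair" is approximated here as every cell having the same
    size as the largest cell. The paper may use a different criterion.
    """
    counts = Counter(zip(groups, labels))
    all_groups = sorted(set(groups))
    all_labels = sorted(set(labels))
    target = max(counts.values())  # size of the best-represented cell
    return {
        (g, y): target - counts.get((g, y), 0)
        for g in all_groups
        for y in all_labels
    }


# Toy data: group "a" with a positive label is over-represented.
groups = ["a", "a", "a", "a", "b", "b"]
labels = [1, 1, 1, 0, 1, 0]
plan = synthetic_sample_plan(groups, labels)
for cell, deficit in plan.items():
    print(cell, deficit)
```

In a cGAN setting, each cell with a nonzero deficit could serve as the condition (group and label) passed to the generator, with the deficit giving the number of samples to draw for that condition.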

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.09972/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1905.09972/full.md

## References

41 references — full list in the complete paper: https://tomesphere.com/paper/1905.09972/full.md

---
Source: https://tomesphere.com/paper/1905.09972