Fair Preprocessing: Towards Understanding Compositional Fairness of Data Transformers in Machine Learning Pipeline

Sumon Biswas; Hridesh Rajan

arXiv:2106.06054 · cs.LG · July 21, 2021

TL;DR

This paper introduces a causal fairness method to evaluate how data preprocessing stages impact overall fairness in machine learning pipelines, revealing that certain transformations can cause unfairness and proposing ways to mitigate it.

Contribution

It is the first to measure and analyze the fairness impact of individual data preprocessing stages in ML pipelines using causal methods.

Findings

01. Certain data transformers induce unfairness in models.

02. Fairness patterns are identified across different transformer categories.

03. Local fairness impacts the global fairness of the pipeline.
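The idea of attributing a fairness change to a single preprocessing stage can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the comparison is made between predictions from a pipeline that includes the stage and predictions from an otherwise identical pipeline without it, and it uses statistical parity difference as the fairness metric. The function names `statistical_parity_difference` and `stage_fairness_impact`, the group encoding (1 = privileged, 0 = unprivileged), and the toy data are all hypothetical.

```python
from typing import Sequence


def statistical_parity_difference(preds: Sequence[int], protected: Sequence[int]) -> float:
    """Favorable-outcome rate of the unprivileged group minus that of the privileged group.

    Assumed encoding: preds are 0/1 with 1 = favorable; protected is 1 for
    privileged and 0 for unprivileged individuals.
    """
    privileged = [p for p, g in zip(preds, protected) if g == 1]
    unprivileged = [p for p, g in zip(preds, protected) if g == 0]
    if not privileged or not unprivileged:
        raise ValueError("both groups must be non-empty")
    return sum(unprivileged) / len(unprivileged) - sum(privileged) / len(privileged)


def stage_fairness_impact(
    preds_with_stage: Sequence[int],
    preds_without_stage: Sequence[int],
    protected: Sequence[int],
) -> float:
    """Change in statistical parity difference attributable to one stage.

    A value further from zero than the baseline suggests the stage moved
    the pipeline away from parity (hypothetical reading, see lead-in).
    """
    with_stage = statistical_parity_difference(preds_with_stage, protected)
    without_stage = statistical_parity_difference(preds_without_stage, protected)
    return with_stage - without_stage


# Toy data (made up): four privileged and four unprivileged individuals.
protected = [1, 1, 1, 1, 0, 0, 0, 0]
preds_without_stage = [1, 1, 0, 0, 1, 1, 0, 0]  # equal favorable rates
preds_with_stage = [1, 1, 1, 0, 1, 0, 0, 0]     # stage shifts outcomes toward privileged

impact = stage_fairness_impact(preds_with_stage, preds_without_stage, protected)
print(f"fairness impact of the stage: {impact:+.2f}")
```

In this toy case the pipeline without the stage is at parity, so the whole shift is attributed to the stage. In practice the two prediction vectors would come from training and evaluating both pipeline variants on the same split.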

Abstract

In recent years, many incidents have been reported where machine learning models exhibited discrimination among people based on race, sex, age, etc. Research has been conducted to measure and mitigate unfairness in machine learning models. For a machine learning task, it is common practice to build a pipeline that includes an ordered set of data preprocessing stages followed by a classifier. However, most of the research on fairness has considered a prediction task based on a single classifier. What are the fairness impacts of the preprocessing stages in a machine learning pipeline? Furthermore, studies have shown that the root cause of unfairness is often ingrained in the data itself, rather than the model. But no research has been conducted to measure the unfairness caused by a specific transformation made in the data preprocessing stage. In this paper, we introduced the causal method of…

Code & Models

Repositories

sumonbis/FairPreprocessing (official)
