TL;DR
This paper introduces a causal fairness method to evaluate how data preprocessing stages impact overall fairness in machine learning pipelines, revealing that certain transformations can cause unfairness and proposing ways to mitigate it.
Contribution
It is the first to measure and analyze the fairness impact of individual data preprocessing stages in ML pipelines using causal methods.
Findings
Certain data transformers induce unfairness in models.
Fairness patterns are identified across different transformer categories.
The fairness impact of an individual preprocessing stage (local fairness) carries through to the fairness of the whole pipeline (global fairness).
Abstract
In recent years, many incidents have been reported where machine learning models exhibited discrimination among people based on race, sex, age, etc. Research has been conducted to measure and mitigate unfairness in machine learning models. For a machine learning task, it is common practice to build a pipeline that includes an ordered set of data preprocessing stages followed by a classifier. However, most of the research on fairness has considered a single-classifier-based prediction task. What are the fairness impacts of the preprocessing stages in a machine learning pipeline? Furthermore, studies have shown that the root cause of unfairness is often ingrained in the data itself rather than in the model. But no research has been conducted to measure the unfairness caused by a specific transformation made in the data preprocessing stage. In this paper, we introduced the causal method of…
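To make the idea of measuring one stage's fairness impact concrete, here is a minimal sketch and not the paper's implementation. It assumes the impact of a stage can be estimated by running the same pipeline with and without that stage and comparing a group fairness metric. The metric (statistical parity difference), the toy mean-threshold classifier, the `coarse_binning` stage, the helper names, and the data are all hypothetical choices made for illustration.

```python
from typing import Callable, List, Sequence

# A preprocessing stage maps a feature column to a transformed feature column.
Stage = Callable[[List[float]], List[float]]


def statistical_parity_difference(preds: Sequence[int], groups: Sequence[int]) -> float:
    """P(pred = 1 | unprivileged) - P(pred = 1 | privileged); 0 means parity."""
    unpriv = [p for p, g in zip(preds, groups) if g == 0]
    priv = [p for p, g in zip(preds, groups) if g == 1]
    return sum(unpriv) / len(unpriv) - sum(priv) / len(priv)


def run_pipeline(features: List[float], stages: Sequence[Stage]) -> List[int]:
    """Apply the stages in order, then a toy classifier: predict 1 if above the mean."""
    for stage in stages:
        features = stage(features)
    threshold = sum(features) / len(features)
    return [1 if x > threshold else 0 for x in features]


def coarse_binning(features: List[float]) -> List[float]:
    """Hypothetical preprocessing stage: round each value down to a multiple of 4."""
    return [float(x // 4 * 4) for x in features]


def stage_impact(features: List[float], groups: Sequence[int],
                 stages: Sequence[Stage], index: int) -> float:
    """Metric with the stage present minus the metric with that stage removed."""
    without = [s for i, s in enumerate(stages) if i != index]
    spd_with = statistical_parity_difference(run_pipeline(features, stages), groups)
    spd_without = statistical_parity_difference(run_pipeline(features, without), groups)
    return spd_with - spd_without


# Toy data (assumption): group 1 is privileged, group 0 is unprivileged.
features = [2.0, 4.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0]
groups = [1, 1, 1, 1, 0, 0, 0, 0]

impact = stage_impact(features, groups, [coarse_binning], index=0)
print(impact)  # -0.25: on this toy data the binning stage shifts predictions toward the privileged group
```

A nonzero result means the stage changed the metric on this data. A real study would use a trained classifier, held-out data, and repeated runs before attributing the change to the transformer.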
