FairPrep: Promoting Data to a First-Class Citizen in Studies on Fairness-Enhancing Interventions
Sebastian Schelter, Yuxuan He, Jatin Khilnani, Julia Stoyanovich

TL;DR
FairPrep is a design and evaluation framework that helps data scientists implement and assess fairness-enhancing interventions in machine learning while following best practices such as data cleaning and hyperparameter tuning.
Contribution
FairPrep introduces a system-level framework for fairness-enhancing interventions, highlights shortcomings in existing studies, and demonstrates how following best practices influences the effectiveness of those interventions.
Findings
Hyperparameter tuning reduces variability in fairness outcomes.
Data cleaning methods significantly impact fairness interventions.
FairPrep makes it possible to measure how following best practices affects fairness outcomes.
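To make "variability in fairness outcomes" concrete, here is a minimal sketch that computes one widely used group fairness metric, disparate impact (the ratio of favorable-prediction rates between the unprivileged and privileged groups), and its spread across repeated runs. It does not use FairPrep's API and is not taken from the paper; the function names, group labels, and toy predictions are hypothetical and for illustration only.

```python
from statistics import mean, pstdev

def disparate_impact(predictions, groups, privileged="a"):
    """Favorable-prediction rate of the unprivileged group divided by
    that of the privileged group. A value of 1.0 means equal rates."""
    priv = [p for p, g in zip(predictions, groups) if g == privileged]
    unpriv = [p for p, g in zip(predictions, groups) if g != privileged]
    if not priv or not unpriv:
        raise ValueError("both groups must be non-empty")
    priv_rate = mean(priv)
    if priv_rate == 0:
        raise ValueError("privileged group has no favorable predictions")
    return mean(unpriv) / priv_rate

def summarize_runs(runs, groups):
    """Mean and population standard deviation of the metric across runs."""
    scores = [disparate_impact(preds, groups) for preds in runs]
    return mean(scores), pstdev(scores)

# Hypothetical toy data: group membership for eight test records and
# binary predictions (1 = favorable) from three runs, e.g. different seeds.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
runs = [
    [1, 1, 0, 1, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 1, 1, 1],
]

avg, spread = summarize_runs(runs, groups)
print(f"disparate impact: mean={avg:.3f}, std={spread:.3f}")
```

Running the same summary for a tuned and an untuned configuration, each over several seeds, would be one way to compare the spread of the metric between the two.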
Abstract
The importance of incorporating ethics and legal compliance into machine-assisted decision-making is broadly recognized. Further, several lines of recent work have argued that critical opportunities for improving data quality and representativeness, controlling for bias, and allowing humans to oversee and impact computational processes are missed if we do not consider the lifecycle stages upstream from model training and deployment. Yet, very little has been done to date to provide system-level support to data scientists who wish to develop and deploy responsible machine learning methods. We aim to fill this gap and present FairPrep, a design and evaluation framework for fairness-enhancing interventions. FairPrep is based on a developer-centered design, and helps data scientists follow best practices in software engineering and machine learning. As part of our contribution, we…
Taxonomy
Topics: Ethics and Social Impacts of AI · Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data
