Impact of Data Processing on Fairness in Supervised Learning
Sajad Khodadadian, AmirEmad Ghassami, Negar Kiyavash

TL;DR
This paper studies how pre-processing and post-processing affect fairness in supervised learning. It proposes a pre-processing method based on convex optimization, compares it with post-processing, and identifies conditions under which pre-processing performs better.
Contribution
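The paper introduces a convex optimization framework for designing a pre-processing module that reduces discrimination, derives a lower bound on the discrimination attainable for a given acceptable distortion, and compares pre-processing with post-processing under shared accuracy and fairness measures.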
Findings
Under some mild conditions, pre-processing outperforms post-processing.
The pre-processing module is designed by solving a convex optimization program and is placed before the original classifier.
A fundamental lower bound on attainable discrimination is derived for any acceptable distortion in the outcome.
Abstract
We study the impact of pre- and post-processing for reducing discrimination in data-driven decision makers. We first analyze the fundamental trade-off between fairness and accuracy in a pre-processing approach, and propose a design for a pre-processing module based on a convex optimization program, which can be added before the original classifier. This leads to a fundamental lower bound on attainable discrimination, given any acceptable distortion in the outcome. Furthermore, we reformulate an existing post-processing method in terms of our accuracy and fairness measures, which allows comparing post-processing and pre-processing approaches. We show that under some mild conditions, pre-processing outperforms post-processing. Finally, we show that by appropriate choice of the discrimination measure, the optimization problem for both pre- and post-processing approaches will reduce to a…
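To make the idea of a lower bound on discrimination at a given distortion concrete, here is a toy sketch in Python. It is not the paper's optimization program: the abstract does not give the formulation, so everything here is an assumption chosen for illustration. It assumes two groups, measures discrimination as the gap between the groups' acceptance rates, and measures distortion as the expected fraction of decisions that change. The function name `min_parity_gap` and its parameters are hypothetical.

```python
def min_parity_gap(rates, weights, max_distortion):
    """Toy lower bound on the acceptance-rate gap between two groups.

    Assumed setup (illustrative only, not taken from the paper):
      rates   = (r0, r1): original acceptance rate of each group
      weights = (w0, w1): positive population share of each group
      distortion of new rates (q0, q1) = w0*|q0 - r0| + w1*|q1 - r1|
      discrimination of new rates      = |q0 - q1|
    Returns the smallest discrimination with distortion <= max_distortion.
    """
    (r0, r1), (w0, w1) = rates, weights
    if min(w0, w1) <= 0:
        raise ValueError("group weights must be positive")
    gap = abs(r0 - r1)
    # Shrinking the gap by delta costs w * delta for whichever group moves,
    # so the cheapest plan moves only the group with the smaller weight.
    cheapest_weight = min(w0, w1)
    return max(0.0, gap - max_distortion / cheapest_weight)


if __name__ == "__main__":
    # Sweep the distortion budget to trace the toy trade-off curve.
    for budget in (0.0, 0.03, 0.06, 0.2):
        print(budget, min_parity_gap((0.6, 0.3), (0.7, 0.3), budget))
```

In this toy setting both the objective and the constraint are convex in the new rates, which is why a closed-form answer exists. The remaining gap falls linearly as the distortion budget grows and reaches zero once the budget covers the full original gap for the smaller group.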
