Optimal Pre-Processing to Achieve Fairness and Its Relationship with Total Variation Barycenter
Farhad Farokhi

TL;DR
This paper establishes a theoretical framework linking fairness, utility, and data pre-processing through total variation distances, and proposes an efficient linear programming approach that enforces fairness while minimizing a bound on utility loss.
Contribution
It introduces a novel linear programming formulation for optimal pre-processing to enforce fairness based on total variation bounds, connecting fairness with barycenters of distributions.
Findings
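Disparate impact is upper bounded by the total variation distance between the input distributions conditioned on the protected attributes.
Utility degradation is upper bounded by the total variation distance between the data distributions before and after pre-processing.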
Optimal pre-processing can be efficiently computed via linear programming.
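The first finding can be checked numerically on a toy example. The sketch below is illustrative only and is not taken from the paper: it assumes discrete inputs, a binary output, and measures disparate impact as the absolute gap in the probability of a positive output between two groups, which may differ from the paper's exact definition. The distributions `p_group_a` and `p_group_b` and both helper functions are hypothetical.

```python
from itertools import product


def total_variation(p, q):
    """Total variation distance between two discrete distributions (dicts)."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


def max_output_gap(p, q):
    """Largest gap in P(f(X) = 1) between the two groups, taken over
    every deterministic binary classifier f on the joint support."""
    support = sorted(set(p) | set(q))
    best = 0.0
    # Enumerate all 2^n labelings of the support (fine for tiny examples).
    for labels in product([0, 1], repeat=len(support)):
        gap = abs(sum(p.get(x, 0.0) - q.get(x, 0.0)
                      for x, y in zip(support, labels) if y == 1))
        best = max(best, gap)
    return best


# Hypothetical input distributions for two protected groups.
p_group_a = {"low": 0.5, "mid": 0.3, "high": 0.2}
p_group_b = {"low": 0.2, "mid": 0.3, "high": 0.5}

tv = total_variation(p_group_a, p_group_b)
gap = max_output_gap(p_group_a, p_group_b)

# No classifier can separate the groups by more than the TV distance.
print(f"TV distance: {tv:.3f}, worst-case output gap: {gap:.3f}")
```

Under these assumptions the worst-case gap never exceeds the total variation distance, so shrinking that distance through pre-processing limits how unfair any downstream classifier can be.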
Abstract
We use disparate impact, i.e., the extent to which the probability of observing an output depends on protected attributes such as race and gender, to measure fairness. We prove that disparate impact is upper bounded by the total variation distance between the distributions of the inputs given the protected attributes. We then use pre-processing, also known as data repair, to enforce fairness. We show that utility degradation, i.e., the extent to which the success of a forecasting model changes by pre-processing the data, is upper bounded by the total variation distance between the distributions of the data before and after pre-processing. Hence, the problem of finding the optimal pre-processing regimen for enforcing fairness can be cast as minimizing the total variation distance between the distributions of the data before and after pre-processing subject to a constraint on the total variation…
Taxonomy
Topics: Ethics and Social Impacts of AI · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
