On Comparing Fair Classifiers under Data Bias
Mohit Sharma, Amit Deshpande, Rajiv Ratn Shah

TL;DR
This paper studies how bias in the training data affects the fairness and accuracy of fair classifiers. Many fair classifiers degrade as the bias increases, while simple techniques such as reweighing remain stable, which has practical implications for fairness monitoring.
Contribution
It provides a comprehensive empirical analysis of the impact of data bias on fair classifiers and evaluates simple bias mitigation techniques across multiple datasets.
Findings
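The fairness and accuracy of many standard fair classifiers degrade severely as the bias injected into their training data increases.
A simple logistic regression model trained on unbiased data can often outperform, in both accuracy and fairness, fair classifiers trained on biased data.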
Simple techniques like reweighing offer stable fairness and accuracy under bias.
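As a rough illustration of the reweighing finding, the sketch below computes per-example weights, assuming "reweighing" refers to the standard pre-processing scheme of Kamiran and Calders, where each (group, label) cell receives the weight P(group) · P(label) / P(group, label). The function name `reweighing_weights` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter

def reweighing_weights(labels, groups):
    """Return one weight per example so that, under the weights,
    the label is statistically independent of the group.

    Assumed scheme: w(g, y) = P(g) * P(y) / P(g, y),
    with probabilities estimated by counting.
    """
    n = len(labels)
    n_group = Counter(groups)               # count of examples per group
    n_label = Counter(labels)               # count of examples per label
    n_joint = Counter(zip(groups, labels))  # count per (group, label) cell
    return [
        n_group[g] * n_label[y] / (n * n_joint[(g, y)])
        for g, y in zip(groups, labels)
    ]

# Toy data: group 0 is mostly positive, group 1 is mostly negative.
labels = [1, 1, 1, 0, 0, 1, 0, 0]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
weights = reweighing_weights(labels, groups)

def weighted_positive_rate(group):
    """Weighted fraction of positive labels within one group."""
    pos = sum(w for w, y, g in zip(weights, labels, groups) if g == group and y == 1)
    tot = sum(w for w, g in zip(weights, groups) if g == group)
    return pos / tot
```

Such weights would typically be passed to a learner that accepts per-example weights, for example via the `sample_weight` argument of scikit-learn's `LogisticRegression.fit`.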
Abstract
In this paper, we consider a theoretical model for injecting data bias, namely, under-representation and label bias (Blum & Stangl, 2019). We empirically study the effect of varying data biases on the accuracy and fairness of fair classifiers. Through extensive experiments on both synthetic and real-world datasets (e.g., Adult, German Credit, Bank Marketing, COMPAS), we empirically audit pre-, in-, and post-processing fair classifiers from standard fairness toolkits for their fairness and accuracy by injecting varying amounts of under-representation and label bias in their training data (but not the test data). Our main observations are: 1. The fairness and accuracy of many standard fair classifiers degrade severely as the bias injected in their training data increases, 2. A simple logistic regression model trained on the right data can often outperform, in both accuracy and fairness,…
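The bias-injection step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: under-representation bias is assumed to keep each positively labelled minority-group example with probability `beta`, and label bias is assumed to flip each surviving positive minority label to negative with probability `nu`. The function name `inject_bias`, the parameter names, and the convention that group `1` is the minority are all hypothetical. Only the training split would be passed through it, since the test data is left unbiased.

```python
import random

def inject_bias(rows, beta=1.0, nu=0.0, seed=0):
    """Inject under-representation and label bias into training rows.

    rows: list of (features, label, group) tuples, with label in {0, 1}
          and group == 1 marking the minority group (an assumption).
    beta: probability of keeping a minority positive (1.0 = no bias).
    nu:   probability of flipping a kept minority positive to 0.
    """
    rng = random.Random(seed)
    biased = []
    for features, label, group in rows:
        if group == 1 and label == 1:
            # Under-representation: drop with probability 1 - beta.
            if rng.random() >= beta:
                continue
            # Label bias: flip the positive label with probability nu.
            if rng.random() < nu:
                label = 0
        biased.append((features, label, group))
    return biased

# Toy training set: (features, label, group)
train = [
    ((0.1,), 1, 1), ((0.2,), 1, 1), ((0.3,), 1, 0), ((0.4,), 0, 0),
    ((0.5,), 0, 1), ((0.6,), 1, 1), ((0.7,), 1, 0), ((0.8,), 0, 0),
]
unchanged = inject_bias(train, beta=1.0, nu=0.0)     # no bias injected
under_rep = inject_bias(train, beta=0.0, nu=0.0)     # all minority positives dropped
label_biased = inject_bias(train, beta=1.0, nu=1.0)  # all minority positives flipped
```

Sweeping `beta` and `nu` over a grid and retraining each classifier on the resulting data would give the kind of stability comparison the abstract describes.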
Taxonomy
Topics: Ethics and Social Impacts of AI
Methods: Test · Logistic Regression
