Optimized Data Pre-Processing for Discrimination Prevention
Flavio P. Calmon, Dennis Wei, Karthikeyan Natesan Ramamurthy, and Kush R. Varshney

TL;DR
This paper presents a convex optimization approach for data pre-processing aimed at reducing discrimination in algorithmic decision-making while maintaining data utility and limiting sample distortion.
Contribution
It introduces a novel probabilistic formulation and convex optimization method for data transformation that balances discrimination control, utility preservation, and sample distortion.
Findings
Discrimination control, utility preservation, and low sample distortion can be achieved simultaneously.
Applying the method to datasets, including real-world criminal recidivism data, demonstrates its effectiveness.
The results also reveal patterns of bias in American society.
Abstract
Non-discrimination is a recognized objective in algorithmic decision making. In this paper, we introduce a novel probabilistic formulation of data pre-processing for reducing discrimination. We propose a convex optimization for learning a data transformation with three goals: controlling discrimination, limiting distortion in individual data samples, and preserving utility. We characterize the impact of limited sample size in accomplishing this objective, and apply two instances of the proposed optimization to datasets, including one on real-world criminal recidivism. The results demonstrate that all three criteria can be simultaneously achieved and also reveal interesting patterns of bias in American society.
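The three goals named in the abstract can be made concrete on a toy example. The sketch below does not solve the paper's convex program; it only evaluates one hand-picked randomized label mapping against three illustrative measures. Everything in it is an assumption for illustration: the joint distribution is made up, the features are omitted so that only a binary protected attribute d and a binary outcome y remain, discrimination is measured as the absolute gap in positive-outcome rates, distortion as the largest label-flip probability, and utility loss as the total variation distance between the original and transformed joint distributions.

```python
# Toy joint distribution p(d, y); the numbers are made up for illustration.
p_dy = {(0, 0): 0.30, (0, 1): 0.20, (1, 0): 0.15, (1, 1): 0.35}

# Hypothetical randomized mapping: probability of flipping y for each (d, y).
flip = {(0, 0): 0.2, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.2}


def transformed_joint(joint, flip_prob):
    """Return the joint p(d, y_hat) after applying the randomized label flips."""
    out = {(d, y): 0.0 for d in (0, 1) for y in (0, 1)}
    for (d, y), p in joint.items():
        out[(d, y)] += p * (1.0 - flip_prob[(d, y)])   # label kept
        out[(d, 1 - y)] += p * flip_prob[(d, y)]       # label flipped
    return out


def positive_rate(joint, d):
    """Conditional probability of outcome 1 given protected attribute d."""
    return joint[(d, 1)] / (joint[(d, 0)] + joint[(d, 1)])


def discrimination(joint):
    """Absolute gap in positive-outcome rates between the two groups."""
    return abs(positive_rate(joint, 0) - positive_rate(joint, 1))


def max_distortion(flip_prob):
    """Worst-case probability that an individual's label is changed."""
    return max(flip_prob.values())


def utility_loss(original, transformed):
    """Total variation distance between the two joint distributions."""
    return 0.5 * sum(abs(original[k] - transformed[k]) for k in original)


q_dy = transformed_joint(p_dy, flip)
print("discrimination before:", round(discrimination(p_dy), 4))
print("discrimination after: ", round(discrimination(q_dy), 4))
print("max distortion:       ", round(max_distortion(flip), 4))
print("utility loss (TV):    ", round(utility_loss(p_dy, q_dy), 4))
```

In this toy setting the mapping narrows the gap between groups at the cost of some label flips and some shift in the joint distribution. An optimization-based approach such as the one the abstract describes would search over mappings rather than fix one by hand, trading these quantities off under explicit constraints.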
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI
