Optimized Data Pre-Processing for Discrimination Prevention

Flavio P. Calmon, Dennis Wei, Karthikeyan Natesan Ramamurthy, and Kush R. Varshney

arXiv:1704.03354·stat.ML·April 12, 2017·40 cites

TL;DR

This paper presents a convex optimization approach for data pre-processing aimed at reducing discrimination in algorithmic decision-making while maintaining data utility and limiting sample distortion.

Contribution

It introduces a novel probabilistic formulation and convex optimization method for data transformation that balances discrimination control, utility preservation, and sample distortion.

Findings

01. Discrimination control, utility preservation, and low per-sample distortion can be achieved simultaneously.

02. Applying the method to real-world criminal recidivism data demonstrates its effectiveness.

03. The analysis also reveals patterns of bias in American society.

Abstract

Non-discrimination is a recognized objective in algorithmic decision making. In this paper, we introduce a novel probabilistic formulation of data pre-processing for reducing discrimination. We propose a convex optimization for learning a data transformation with three goals: controlling discrimination, limiting distortion in individual data samples, and preserving utility. We characterize the impact of limited sample size in accomplishing this objective, and apply two instances of the proposed optimization to datasets, including one on real-world criminal recidivism. The results demonstrate that all three criteria can be simultaneously achieved and also reveal interesting patterns of bias in American society.
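To make the "controlling discrimination" goal concrete, here is a minimal sketch of one common way to quantify group-level disparity: the gap between the positive-outcome rates of different protected groups. This is an illustrative assumption, not necessarily the measure used in the paper; the function names `positive_rates` and `max_rate_gap` and the toy records are hypothetical.

```python
from collections import defaultdict


def positive_rates(records):
    """Estimate P(y = 1 | d) for each protected-group value d.

    `records` is an iterable of (d, y) pairs with y in {0, 1}.
    """
    counts = defaultdict(lambda: [0, 0])  # d -> [positives, total]
    for d, y in records:
        counts[d][0] += y
        counts[d][1] += 1
    return {d: pos / total for d, (pos, total) in counts.items()}


def max_rate_gap(records):
    """Largest difference in positive-outcome rate between any two groups."""
    rates = positive_rates(records)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: (protected attribute, binary outcome).
toy = [("a", 1), ("a", 1), ("a", 0), ("a", 1),
       ("b", 1), ("b", 0), ("b", 0), ("b", 0)]

rates = positive_rates(toy)
gap = max_rate_gap(toy)
print(rates, gap)
```

A pre-processing method of the kind the abstract describes would aim to transform the data so that a disparity measure like this falls below a chosen threshold, while keeping individual records and the overall data distribution close to the originals.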

Code & Models

Repositories

fair-preprocessing/nips2017

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI