Achieving non-discrimination in data release

Lu Zhang (1), Yongkai Wu (1), Xintao Wu (1) ((1) University of Arkansas)

arXiv:1611.07438·cs.LG·November 23, 2016·1 cites

Open Access

TL;DR

This paper introduces a method for detecting and removing discrimination in datasets by using causal graphs to identify meaningful partitions, so that the released data is fairer for predictive analysis.

Contribution

It presents a novel graphical condition for meaningful partition identification and algorithms for discrimination removal that preserve data utility.

Findings

01. Effective discrimination detection and removal demonstrated on real datasets

02. Algorithms accurately remove discrimination while maintaining data utility

03. Supports fair decision-making in data mining applications

Abstract

Discrimination discovery and prevention/removal are increasingly important tasks in data mining. Discrimination discovery aims to unveil discriminatory practices on the protected attribute (e.g., gender) by analyzing the dataset of historical decision records, and discrimination prevention aims to remove discrimination by modifying the biased data before conducting predictive analysis. In this paper, we show that the key to discrimination discovery and prevention is to find the meaningful partitions that can be used to provide quantitative evidence for the judgment of discrimination. With the support of the causal graph, we present a graphical condition for identifying a meaningful partition. Based on that, we develop a simple criterion for the claim of non-discrimination, and propose discrimination removal algorithms which accurately remove discrimination while retaining good data…
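
To make the idea of partition-level evidence concrete, here is a minimal sketch that is not taken from the paper. It assumes the partitioning attributes are already known (the paper derives them from a causal graph, which this sketch does not attempt), it measures evidence as the gap in positive-decision rates between the two protected groups inside each block, and it compares that gap to a threshold `tau`. The function names, the attribute names in the toy records, and the value of `tau` are all hypothetical.

```python
from collections import defaultdict


def partition_rate_gaps(records, protected, decision, block_attrs):
    """Return, for each block of the partition, the gap
    P(decision=1 | protected=0, block) - P(decision=1 | protected=1, block).

    Assumption: `protected` and `decision` are binary columns coded 0/1.
    """
    # counts[block][group] = [number of positive decisions, number of records]
    counts = defaultdict(lambda: {0: [0, 0], 1: [0, 0]})
    for r in records:
        block = tuple(r[a] for a in block_attrs)
        cell = counts[block][r[protected]]
        cell[0] += r[decision]
        cell[1] += 1

    gaps = {}
    for block, groups in counts.items():
        (pos0, n0), (pos1, n1) = groups[0], groups[1]
        if n0 == 0 or n1 == 0:
            continue  # the gap is undefined when one group is absent
        gaps[block] = pos0 / n0 - pos1 / n1
    return gaps


def is_non_discriminatory(gaps, tau=0.05):
    """Hypothetical criterion: every block's absolute gap stays below tau."""
    return all(abs(g) < tau for g in gaps.values())


def make(major, group, decisions):
    """Build toy records; the column names are illustrative only."""
    return [{"major": major, "gender": group, "admit": d} for d in decisions]


records = (
    make("A", 0, [1, 1, 0, 0]) + make("A", 1, [1, 1, 0, 0])  # equal rates
    + make("B", 0, [1, 1]) + make("B", 1, [1, 0])            # unequal rates
)

gaps = partition_rate_gaps(records, "gender", "admit", ["major"])
fair = is_non_discriminatory(gaps, tau=0.05)
```

In this toy data the two groups are treated identically within major A but not within major B, so the check fails because of a single block. Which attributes form a meaningful partition is exactly what the paper's graphical condition is meant to settle; choosing them by hand, as done here, is only for illustration.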

Taxonomy

Topics: Imbalanced Data Classification Techniques · Ethics and Social Impacts of AI · Privacy-Preserving Technologies in Data