Locating disparities in machine learning

Moritz von Zahn; Oliver Hinz; Stefan Feuerriegel

arXiv:2208.06680·cs.LG·January 25, 2024

TL;DR

This paper introduces ALD, a versatile data-driven framework that automatically detects disparities in machine learning outcomes across various subgroups, even with high-dimensional data and unknown sensitive attributes.

Contribution

ALD is a novel, flexible method capable of locating disparities in machine learning models regardless of classifier type, disparity definition, or predictor complexity.

Findings

01. ALD effectively detects disparities in synthetic datasets.

02. ALD successfully identifies disparities in real-world datasets.

03. ALD produces interpretable audit reports for practitioners.

Abstract

Machine learning can produce predictions with disparate outcomes, in which subgroups of the population (e.g., defined by age, gender, or other sensitive attributes) are systematically disadvantaged. To comply with upcoming legislation, practitioners need to locate such disparate outcomes. However, previous literature typically detects disparities through statistical procedures that require the sensitive attribute to be specified a priori. This limits applicability in real-world settings where datasets are high-dimensional and, on top of that, sensitive attributes may be unknown. As a remedy, we propose a data-driven framework called Automatic Location of Disparities (ALD), which aims at locating disparities in machine learning. ALD meets several demands from industry: ALD (1) is applicable to arbitrary machine learning classifiers; (2) operates on different definitions of disparities…
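To make the notion of a "disparate outcome" concrete, the sketch below computes one common disparity definition, the statistical parity gap (the difference in positive-prediction rates between subgroups). It is not the ALD procedure, which this page does not describe. It covers only the a priori setting the abstract contrasts ALD with, where the sensitive attribute is already known. The function names `positive_rates` and `statistical_parity_gap` and the toy data are hypothetical, chosen for illustration.

```python
from collections import defaultdict


def positive_rates(groups, predictions):
    """Return the share of positive predictions (label 1) per subgroup."""
    counts = defaultdict(lambda: [0, 0])  # subgroup -> [positives, total]
    for group, pred in zip(groups, predictions):
        counts[group][0] += int(pred == 1)
        counts[group][1] += 1
    return {group: pos / total for group, (pos, total) in counts.items()}


def statistical_parity_gap(groups, predictions):
    """Largest difference in positive-prediction rate between any two subgroups."""
    rates = positive_rates(groups, predictions)
    return max(rates.values()) - min(rates.values())


# Toy data (illustrative only): a known sensitive attribute and
# binary classifier outputs for eight individuals.
groups = ["young", "young", "young", "young", "old", "old", "old", "old"]
predictions = [1, 1, 1, 0, 1, 0, 0, 0]

rates = positive_rates(groups, predictions)
gap = statistical_parity_gap(groups, predictions)
print(rates)  # per-subgroup positive rates
print(gap)    # statistical parity gap
```

A gap near zero means the subgroups receive positive predictions at similar rates; a large gap flags a candidate disparity. This check requires naming the subgroup column in advance, which is the limitation the abstract says ALD is designed to remove.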

Code & Models

Repositories

moritzvz/ald (Official)

Taxonomy

Topics: Data Analysis with R