Problem Formulation and Fairness

Samir Passi; Solon Barocas

arXiv:1901.02547·cs.CY·January 16, 2019

TL;DR

This paper shows that formulating a data science problem is a negotiated process in which explicit normative considerations are rarely raised, even though the chosen formulation shapes how fair the resulting project is.

Contribution

The paper provides an ethnographic analysis of problem formulation in data science, based on six months of fieldwork with a corporate data science team, and draws out what the negotiated character of that process implies for fairness and normative assessment.

Findings

01. Problem formulation is negotiated and elastic.

02. Explicit normative considerations are rarely incorporated.

03. Formulation choices significantly influence fairness outcomes.

Abstract

Formulating data science problems is an uncertain and difficult process. It requires various forms of discretionary work to translate high-level objectives or strategic goals into tractable problems, necessitating, among other things, the identification of appropriate target variables and proxies. While these choices are rarely self-evident, normative assessments of data science projects often take them for granted, even though different translations can raise profoundly different ethical concerns. Whether we consider a data science project fair often has as much to do with the formulation of the problem as any property of the resulting model. Building on six months of ethnographic fieldwork with a corporate data science team, and channeling ideas from sociology and history of science, critical data studies, and early writing on knowledge discovery in databases, we describe the…
