Studying Up Machine Learning Data: Why Talk About Bias When We Mean Power?

Milagros Miceli; Julian Posada; Tianling Yang

arXiv:2109.08131 · cs.HC · September 17, 2021 · 26 cites

Open Access

TL;DR

This paper advocates shifting from bias-focused to power-aware perspectives in ML dataset research, emphasizing historical, social, and labor contexts to better understand data's societal impact.

Contribution

It introduces a power-aware framework for studying ML datasets, expanding beyond bias to include social, historical, and labor considerations in data analysis.

Findings

01. Highlighting limitations of bias-only approaches

02. Emphasizing importance of social context in data quality

03. Proposing expanded transparency in dataset documentation

Abstract

Research in machine learning (ML) has primarily argued that models trained on incomplete or biased datasets can lead to discriminatory outputs. In this commentary, we propose moving the research focus beyond bias-oriented framings by adopting a power-aware perspective to "study up" ML datasets. This means accounting for historical inequities, labor conditions, and epistemological standpoints inscribed in data. We draw on HCI and CSCW work to support our argument, critically analyze previous research, and point at two co-existing lines of work within our community -- one bias-oriented, the other power-aware. This way, we highlight the need for dialogue and cooperation in three areas: data quality, data work, and data documentation. In the first area, we argue that reducing societal problems to "bias" misses the context-based nature of data. In the second one, we highlight the corporate…

Taxonomy

Topics: Ethics and Social Impacts of AI · Mobile Crowdsensing and Crowdsourcing · Innovative Human-Technology Interaction