Discipline and Label: A WEIRD Genealogy and Social Theory of Data Annotation
Andrew Smart, Ding Wang, Ellis Monk, Mark Díaz, Atoosa Kasirzadeh, Erin Van Liemt, Sonja Schmer-Galunder

TL;DR
This paper critically examines data annotation's social and psychological dimensions, highlighting its WEIRD bias, impact on social categorization, and implications for AI fairness and generalization.
Contribution
It offers a genealogy of data annotation, linking psychological critiques to social theory, and proposes a framework for understanding its global social and subjective aspects.
Findings
Data annotation reflects WEIRD biases affecting non-WEIRD workers.
Annotation practices may entrench static social categories.
Global social conditions influence annotation and model fairness.
Abstract
Data annotation remains the sine qua non of machine learning and AI. Recent empirical work on data annotation has begun to highlight the importance of rater diversity for fairness and model performance, and new lines of research have begun to examine the working conditions of data annotation workers, the impact and role of annotator subjectivity on labels, and the potential psychological harms from aspects of annotation work. This paper outlines a critical genealogy of data annotation, starting with its psychological and perceptual aspects. We draw on similarities with critiques of the rise of computerized lab-based psychological experiments in the 1970s, which question whether these experiments permit the generalization of results beyond the laboratory settings within which these results are typically obtained. Do data annotations permit the generalization of results beyond the…
Taxonomy
Topics: Semantic Web and Ontologies · Digital Humanities and Scholarship
Methods: Network On Network
