The Problems with Proxies: Making Data Work Visible through Requester Practices
Annabel Rothschild, Ding Wang, Niveditha Jayakumar Vilvanathan, Lauren Wilcox, Carl DiSalvo, and Betsy DiSalvo

TL;DR
This paper examines how current data annotation practices often overlook and undervalue data workers' expertise, leading to ethical issues and data quality problems, and advocates for policy reforms to improve the recognition and treatment of data workers.
Contribution
It reveals the gap between requesters' perceptions and data workers' expertise, proposing policy changes to improve data annotation practices and ethical standards.
Findings
Requesters often hold naive views of worker capabilities.
Ad-hoc qualification tasks undermine data quality.
Policy reforms can improve data worker recognition and ethical standards.
Abstract
Fairness in AI and ML systems is increasingly linked to the proper treatment and recognition of data workers involved in training dataset development. Yet, those who collect and annotate the data, and thus have the most intimate knowledge of its development, are often excluded from critical discussions. This exclusion prevents data annotators, who are domain experts, from contributing effectively to dataset contextualization. Our investigation into the hiring and engagement practices of 52 data work requesters on platforms like Amazon Mechanical Turk reveals a gap: requesters frequently hold naive or unchallenged notions of worker identities and capabilities and rely on ad-hoc qualification tasks that fail to respect the workers' expertise. These practices not only undermine the quality of data but also the ethical standards of AI development. To rectify these issues, we advocate for…
Taxonomy
Topics: Data Quality and Management
