"Is a picture of a bird a bird": Policy recommendations for dealing with ambiguity in machine vision models
Alicia Parrish, Sarah Laszlo, Lora Aroyo

TL;DR
This paper explores the inherent ambiguity in labeling images for machine vision, highlighting sources of subjectivity and proposing best practices to handle label ambiguity in datasets.
Contribution
It identifies key sources of ambiguity in image labeling and offers policy recommendations for managing subjective judgments in machine learning datasets.
Findings
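Identified three primary sources of ambiguity in image labeling: depictions of labels in the images, raters' backgrounds, and the task definition.
Examined the implications of subjective human judgments for labeling images used to train machine vision models.
Suggested best practices, grounded in the empirical results, for handling label ambiguity in machine learning datasets.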
Abstract
Many questions that we ask about the world do not have a single clear answer, yet typical human annotation set-ups in machine learning assume there must be a single ground truth label for all examples in every task. The divergence between reality and practice is stark, especially in cases with inherent ambiguity and where the range of different subjective judgments is wide. Here, we examine the implications of subjective human judgments in the behavioral task of labeling images used to train machine vision models. We identify three primary sources of ambiguity arising from (i) depictions of labels in the images, (ii) raters' backgrounds, and (iii) the task definition. On the basis of the empirical results, we suggest best practices for handling label ambiguity in machine learning datasets.
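As an illustration of the single-ground-truth assumption the abstract describes, the following minimal Python sketch contrasts collapsing rater responses into one majority label with keeping the full distribution of responses. It is not taken from the paper: the function names, the question wording, and the four example ratings are all hypothetical, and entropy is only one possible way to summarize disagreement.

```python
from collections import Counter
import math


def label_distribution(ratings):
    """Return the fraction of raters who chose each label."""
    counts = Counter(ratings)
    total = len(ratings)
    return {label: n / total for label, n in counts.items()}


def majority_label(ratings):
    """Collapse ratings to the single most common label.

    Ties go to the label seen first, which silently hides the disagreement.
    """
    return Counter(ratings).most_common(1)[0][0]


def entropy_bits(distribution):
    """Shannon entropy in bits; 0.0 means every rater gave the same label."""
    return sum(-p * math.log2(p) for p in distribution.values() if p > 0)


# Hypothetical responses from four raters asked "Is there a bird in this image?"
# (made-up data, e.g. for a photo of a painting of a bird).
ratings = ["yes", "yes", "no", "no"]

dist = label_distribution(ratings)      # keeps both answers and their shares
single = majority_label(ratings)        # keeps only one answer
disagreement = entropy_bits(dist)       # summarizes how split the raters were
```

Here the majority vote reports "yes" even though the raters split evenly, while the retained distribution and its entropy of 1 bit keep that disagreement visible to whoever uses the dataset.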
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning
