Generalizable Error Modeling for Human Data Annotation: Evidence From an Industry-Scale Search Data Annotation Program

Heinrich Peters; Alireza Hashemi; James Rae

arXiv:2310.05286 · cs.LG · September 27, 2024

Open Access

TL;DR

This study develops a generalizable error prediction model for human data annotation in search relevance tasks, demonstrating its effectiveness across multiple applications and improving annotation quality and efficiency.

Contribution

The paper introduces a task-agnostic error prediction model trained on behavioral and task features, outperforming prior label-based approaches and applicable across diverse industry-scale annotation tasks.

Findings

01. The error prediction model achieves an AUC of 0.65-0.75.

02. The model generalizes well across applications: a global, task-agnostic model performs on par with task-specific models.

03. Using the model increases annotation correction efficiency by up to 40%.

Abstract

Machine learning (ML) and artificial intelligence (AI) systems rely heavily on human-annotated data for training and evaluation. A major challenge in this context is the occurrence of annotation errors, as their effects can degrade model performance. This paper presents a predictive error model trained to detect potential errors in search relevance annotation tasks for three industry-scale ML applications (music streaming, video streaming, and mobile apps). Drawing on real-world data from an extensive search relevance annotation program, we demonstrate that errors can be predicted with moderate model performance (AUC=0.65-0.75) and that model performance generalizes well across applications (i.e., a global, task-agnostic model performs on par with task-specific models). In contrast to past research, which has often focused on predicting annotation labels from task-specific features, our…
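To make the reported metric concrete, the sketch below computes AUC as the probability that a randomly chosen erroneous annotation receives a higher predicted error score than a randomly chosen correct one. This is a minimal illustration, not the authors' code: the function name `roc_auc`, the label convention (1 = annotation error, 0 = correct), and the sample scores are all assumptions made for the example.

```python
def roc_auc(labels, scores):
    """Pairwise AUC: P(score of an error > score of a non-error), ties count 0.5.

    labels: 1 if the annotation was an error, 0 otherwise (assumed convention).
    scores: predicted error probabilities from some hypothetical model.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one error and one non-error example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # error ranked above non-error
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Made-up data for illustration only.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.5, 0.1, 0.7]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so values of 0.65-0.75 indicate that the model ranks likely errors above correct annotations noticeably more often than chance, but far from perfectly.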

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Data Stream Mining Techniques · Machine Learning and Data Classification