Revisiting Active Learning under (Human) Label Variation

Cornelia Gruber; Helen Alber; Bernd Bischl; Göran Kauermann; Barbara Plank; Matthias Aßenmacher

arXiv:2507.02593 · cs.CL · July 4, 2025

TL;DR

This paper explores the impact of human label variation on active learning, proposing a framework to incorporate plausible label differences and improve annotation strategies in real-world scenarios.

Contribution

It introduces a conceptual framework for integrating human label variation into active learning processes, addressing overlooked complexities in label quality and annotation practices.

Findings

01. Decomposition of label variation into signal and noise.

02. Survey of existing approaches to label variation and active learning.

03. Proposal of an HLV-aware active learning framework.

Abstract

Access to high-quality labeled data remains a limiting factor in applied supervised learning. While label variation (LV), i.e., differing labels for the same instance, is common, especially in natural language processing, annotation frameworks often still rest on the assumption of a single ground truth. This overlooks human label variation (HLV), the occurrence of plausible differences in annotations, as an informative signal. Similarly, active learning (AL), a popular approach to optimizing the use of limited annotation budgets in training ML models, often relies on at least one of several simplifying assumptions, which rarely hold in practice when acknowledging HLV. In this paper, we examine foundational assumptions about truth and label nature, highlighting the need to decompose observed LV into signal (e.g., HLV) and noise (e.g., annotation error). We survey how the AL and (H)LV…
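To make the notion of label variation concrete, the following is a minimal sketch, not taken from the paper, that summarizes the labels several annotators gave one instance as an empirical distribution and measures its spread with Shannon entropy. The function names, item identifiers, and annotations are all hypothetical and chosen purely for illustration.

```python
from collections import Counter
from math import log2


def label_distribution(annotations):
    """Return the empirical label distribution for a single instance."""
    counts = Counter(annotations)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def label_entropy(annotations):
    """Shannon entropy (in bits) of the empirical label distribution."""
    dist = label_distribution(annotations)
    return sum(-p * log2(p) for p in dist.values() if p > 0)


# Hypothetical annotations: four annotators per instance (illustrative only).
annotations = {
    "item_a": ["positive", "positive", "positive", "positive"],
    "item_b": ["positive", "negative", "positive", "negative"],
}

for item, labels in annotations.items():
    print(item, label_distribution(labels), label_entropy(labels))
```

An entropy of zero means the annotators fully agree, while larger values mean more variation. The measure cannot tell plausible disagreement (HLV) from annotation error on its own, which is why the abstract calls for decomposing observed LV into signal and noise.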

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Natural Language Processing Techniques