Don't Waste a Single Annotation: Improving Single-Label Classifiers Through Soft Labels

Ben Wu; Yue Li; Yida Mu; Carolina Scarton; Kalina Bontcheva; Xingyi Song

arXiv:2311.05265·cs.CL·November 10, 2023·1 cites

TL;DR

This paper proposes a soft label training approach for single-label classifiers that leverages annotator disagreement and confidence to improve model performance and calibration.

Contribution

It introduces a novel method that uses ambiguous annotations and annotator confidence to generate soft labels, enhancing classifier training.

Findings

01

Improved classifier accuracy on test sets.

02

Enhanced calibration of predicted probabilities.

03

Effective use of annotator disagreement information.

Abstract

In this paper, we address the limitations of the common data annotation and training methods for objective single-label classification tasks. Typically, when annotating such tasks, annotators are only asked to provide a single label for each sample, and annotator disagreement is discarded when a final hard label is decided through majority voting. We challenge this traditional approach, acknowledging that determining the appropriate label can be difficult due to the ambiguity and lack of context in the data samples. Rather than discarding the information from such ambiguous annotations, our soft label method makes use of them for training. Our findings indicate that additional annotator information, such as confidence, secondary label and disagreement, can be used to effectively generate soft labels. Training classifiers with these soft labels then leads to improved performance and…
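
To make the idea concrete, here is a minimal sketch of how per-annotator information could be turned into a soft label and used as a training target. It is an illustration under stated assumptions, not the authors' exact procedure: the abstract does not give the formula, so the weighting rule (confidence mass on the primary label, the remainder on the secondary label if one was given, otherwise spread evenly over the other classes) and the function names `annotator_distribution`, `soft_label`, and `soft_cross_entropy` are all hypothetical.

```python
import math

def annotator_distribution(primary, confidence, num_classes, secondary=None):
    """Turn one annotation into a probability distribution over classes.

    Assumed rule (not taken from the paper): `confidence` goes to the primary
    label; the leftover mass goes to the secondary label if given, otherwise
    it is shared equally among all non-primary classes.
    """
    dist = [0.0] * num_classes
    dist[primary] = confidence
    remainder = 1.0 - confidence
    if secondary is not None:
        dist[secondary] += remainder
    else:
        share = remainder / (num_classes - 1)
        for c in range(num_classes):
            if c != primary:
                dist[c] += share
    return dist

def soft_label(annotations, num_classes):
    """Average the per-annotator distributions so disagreement is preserved."""
    total = [0.0] * num_classes
    for primary, confidence, secondary in annotations:
        dist = annotator_distribution(primary, confidence, num_classes, secondary)
        for c in range(num_classes):
            total[c] += dist[c]
    return [t / len(annotations) for t in total]

def soft_cross_entropy(predicted_probs, target, eps=1e-12):
    """Cross-entropy between a soft target and the model's predicted probabilities."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted_probs))

# Three annotators, three classes: (primary label, confidence, optional secondary label).
annotations = [(0, 0.9, None), (0, 0.6, 1), (1, 0.8, 0)]
label = soft_label(annotations, num_classes=3)
uniform = [1.0 / 3] * 3
loss_matched = soft_cross_entropy(label, label)
loss_uniform = soft_cross_entropy(uniform, label)
```

A majority vote over the same three annotations would collapse to a hard label of class 0. The soft label keeps most of the mass on class 0 but retains substantial mass on class 1, which is the disagreement signal the abstract argues should not be thrown away.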

Taxonomy

Topics: Water Systems and Optimization · Machine Learning and Data Classification · Text and Document Classification Technologies