Eliciting and Learning with Soft Labels from Every Annotator

Katherine M. Collins; Umang Bhatt; Adrian Weller

arXiv:2207.00810 · cs.LG · August 31, 2022

TL;DR

This paper introduces a method for efficiently eliciting soft labels from individual annotators. Models trained on these labels perform comparably to prior approaches that aggregate many annotators' hard labels, while requiring far fewer annotators. The authors also release the resulting dataset, CIFAR-10S.

Contribution

The paper presents a novel elicitation methodology for soft labels, a new dataset CIFAR-10S, and demonstrates comparable model performance with fewer annotators.

Findings

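01. Learning with the elicited soft labels achieves model performance comparable to prior approaches.

02. Far fewer annotators are needed to reach that performance than when soft labels are formed from multiple annotators' hard labels.

03. Eliciting soft labels carries a significant time cost.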

Abstract

The labels used to train machine learning (ML) models are of paramount importance. Typically for ML classification tasks, datasets contain hard labels, yet learning using soft labels has been shown to yield benefits for model generalization, robustness, and calibration. Earlier work found success in forming soft labels from multiple annotators' hard labels; however, this approach may not converge to the best labels and necessitates many annotators, which can be expensive and inefficient. We focus on efficiently eliciting soft labels from individual annotators. We collect and release a dataset of soft labels (which we call CIFAR-10S) over the CIFAR-10 test set via a crowdsourcing study (N=248). We demonstrate that learning with our labels achieves comparable model performance to prior approaches while requiring far fewer annotators -- albeit with significant temporal costs per…
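To make the distinction in the abstract concrete, the following is a minimal sketch, not the authors' code, of the two ways a soft label can arise: by aggregating several annotators' hard labels into vote frequencies, or by taking a probability distribution directly from one annotator. It also shows a cross-entropy loss that accepts soft targets. The function names, the example votes, and the 0.7/0.3 individual label are illustrative assumptions and do not come from the paper or the CIFAR-10S release.

```python
import numpy as np

NUM_CLASSES = 10  # CIFAR-10 has 10 classes


def soft_label_from_votes(votes, num_classes=NUM_CLASSES):
    """Aggregate several annotators' hard labels into vote frequencies."""
    counts = np.bincount(np.asarray(votes), minlength=num_classes)
    return counts / counts.sum()


def soft_cross_entropy(logits, soft_targets):
    """Mean cross-entropy between softmax(logits) and soft target distributions."""
    # Subtract the row-wise max for numerical stability before the log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(soft_targets * log_probs).sum(axis=1).mean())


# Hypothetical example: four annotators each give one hard label for an image.
aggregated = soft_label_from_votes([3, 3, 5, 3])

# Hypothetical example: a single annotator supplies a distribution directly.
individual = np.zeros(NUM_CLASSES)
individual[3], individual[5] = 0.7, 0.3

# Placeholder model outputs (all-zero logits give a uniform prediction).
logits = np.zeros((2, NUM_CLASSES))
loss = soft_cross_entropy(logits, np.stack([aggregated, individual]))
```

When every target is one-hot, this loss reduces to the usual hard-label cross-entropy, so the same training loop can consume either kind of label.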

Code & Models

Repositories

cambridge-mlg/cifar-10s (PyTorch, official)
