Crowd-Calibrator: Can Annotator Disagreement Inform Calibration in Subjective Tasks?

Urja Khurana; Eric Nalisnick; Antske Fokkens; Swabha Swayamdipta

arXiv:2408.14141 · cs.CL · August 27, 2024

TL;DR

This paper introduces Crowd-Calibrator, a method that uses annotator disagreement to improve model calibration in subjective NLP tasks, allowing models to better handle uncertainty and abstain when appropriate.

Contribution

It proposes a novel calibration approach that incorporates crowd worker disagreement, enhancing model decision-making in subjective tasks.

Findings

01. Outperforms existing selective prediction baselines on hate speech detection.

02. Achieves competitive performance on natural language inference.

03. Highlights the importance of human disagreement in model calibration.

Abstract

Subjective tasks in NLP have been mostly relegated to objective standards, where the gold label is decided by taking the majority vote. This obfuscates annotator disagreement and the inherent uncertainty of the label. We argue that subjectivity should factor into model decisions and play a direct role via calibration under a selective prediction setting. Specifically, instead of calibrating confidence purely from the model's perspective, we calibrate models for subjective tasks based on crowd worker agreement. Our method, Crowd-Calibrator, models the distance between the distribution of crowd worker labels and the model's own distribution over labels to inform whether the model should abstain from a decision. On two highly subjective tasks, hate speech detection and natural language inference, our experiments show Crowd-Calibrator either outperforms or achieves competitive performance…
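
The abstention idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the excerpt above does not say which distance is used or how the crowd distribution is obtained at test time, so the sketch assumes the crowd distribution is already available, uses total variation distance as a stand-in, and abstains when the distance exceeds a hypothetical threshold. The names `label_distribution`, `total_variation`, `predict_or_abstain`, and the value 0.25 are all illustrative.

```python
from collections import Counter


def label_distribution(votes, labels):
    """Normalize raw annotator votes into a distribution over `labels`."""
    counts = Counter(votes)
    total = len(votes)
    return [counts[label] / total for label in labels]


def total_variation(p, q):
    """Total variation distance between two discrete distributions.

    Assumption: the paper's actual distance may differ; this is a stand-in.
    """
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def predict_or_abstain(model_probs, crowd_probs, labels, threshold=0.25):
    """Return (label, distance), with label None when the model abstains.

    Assumption: a large model-crowd distance triggers abstention, and the
    threshold is a hypothetical value that would be tuned on held-out data.
    """
    distance = total_variation(model_probs, crowd_probs)
    if distance > threshold:
        return None, distance
    best = max(range(len(labels)), key=lambda i: model_probs[i])
    return labels[best], distance


LABELS = ["hate", "not_hate"]

# Four annotators, three of whom chose "hate": a soft label of 0.75 / 0.25.
crowd = label_distribution(["hate", "hate", "not_hate", "hate"], LABELS)

# Model roughly agrees with the crowd, so it commits to a prediction.
agree_label, agree_dist = predict_or_abstain([0.8, 0.2], crowd, LABELS)

# Model strongly disagrees with the crowd, so it abstains.
clash_label, clash_dist = predict_or_abstain([0.2, 0.8], crowd, LABELS)
```

Keeping the crowd labels as a distribution, rather than collapsing them to a majority vote, is what lets the disagreement itself enter the decision: a 3-to-1 split and a unanimous vote yield the same gold label but different distances to the model's output.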

Taxonomy

Topics: Misinformation and Its Impacts