Truth Discovery in Sequence Labels from Crowds

Nasim Sabetpour; Adithya Kulkarni; Sihong Xie; Qi Li

arXiv:2109.04470·cs.HC·July 4, 2023

TL;DR

This paper introduces AggSLC, an optimization-based method for accurately inferring true sequence labels from crowdsourced annotations, effectively handling dependencies and worker reliability in NLP tasks.

Contribution

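It proposes an aggregation algorithm for sequential labels that jointly considers the characteristics of sequence labeling tasks, worker reliability, and machine learning techniques. The authors show that the algorithm converges and report better performance than existing aggregation methods.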

Findings

01. Outperforms existing aggregation methods on NER and biomedical datasets.

02. Demonstrates effective handling of sequential dependencies and worker reliability.

03. Shows that the proposed algorithm converges after a finite number of iterations.

Abstract

Annotation quality and quantity positively affect the learning performance of sequence labeling, a vital task in Natural Language Processing. Hiring domain experts to annotate a corpus is very costly in terms of money and time. Crowdsourcing platforms, such as Amazon Mechanical Turk (AMT), have been deployed to assist in this purpose. However, the annotations collected this way are prone to human errors due to the lack of expertise of the crowd workers. Existing literature in annotation aggregation assumes that annotations are independent and thus faces challenges when handling the sequential label aggregation tasks with complex dependencies. To conquer the challenges, we propose an optimization-based method that infers the ground truth labels using annotations provided by workers for sequential labeling tasks. The proposed Aggregation method for Sequential Labels from Crowds (AggSLC)…
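To make the setting concrete, here is a minimal sketch of the kind of independent, per-token aggregation that the abstract contrasts with. It is not AggSLC and is not taken from the paper or its repository. It alternates between a reliability-weighted vote for each token and a re-estimate of each worker's weight from their disagreement with the current estimate. The function name `aggregate_tokens`, the smoothed error rate, and the negative-log weight update are all assumptions chosen for illustration.

```python
from collections import defaultdict
import math


def aggregate_tokens(annotations, n_iters=10):
    """Illustrative per-token truth discovery (NOT the paper's AggSLC).

    annotations: dict mapping worker id -> list of labels, one per token;
                 use None where a worker skipped a token.
    Assumes every token was labeled by at least one worker.
    Returns (estimated_labels, worker_weights).
    """
    workers = list(annotations)
    n_tokens = len(next(iter(annotations.values())))
    weights = {w: 1.0 for w in workers}  # start with equal trust
    truths = [None] * n_tokens

    for _ in range(n_iters):
        # Step 1: weighted vote for each token, treating tokens independently.
        new_truths = []
        for t in range(n_tokens):
            votes = defaultdict(float)
            for w in workers:
                label = annotations[w][t]
                if label is not None:
                    votes[label] += weights[w]
            # sorted() makes tie-breaking deterministic (alphabetical).
            new_truths.append(max(sorted(votes), key=votes.get))

        # Step 2: re-estimate reliability from disagreement with the estimate.
        errors = {}
        for w in workers:
            pairs = [(a, z) for a, z in zip(annotations[w], new_truths)
                     if a is not None]
            wrong = sum(a != z for a, z in pairs)
            # Smoothed error rate (assumption) keeps the log below finite.
            errors[w] = (wrong + 1) / (len(pairs) + 2)
        total = sum(errors.values())
        # Assumed update: smaller share of the total error -> larger weight.
        weights = {w: -math.log(errors[w] / total) for w in workers}

        converged = new_truths == truths
        truths = new_truths
        if converged:
            break

    return truths, weights


if __name__ == "__main__":
    crowd = {
        "w1": ["B-PER", "I-PER", "O", "B-LOC", "O"],
        "w2": ["B-PER", "I-PER", "O", "B-LOC", "O"],
        "w3": ["O", "O", "O", "B-LOC", "B-LOC"],
    }
    labels, weights = aggregate_tokens(crowd)
    print(labels)
    print(weights)
```

Because each token is voted on separately, nothing in this sketch prevents an invalid tag sequence such as an `I-PER` that follows `O`. That is the limitation of the independence assumption that the abstract points to.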

Code & Models

Repositories

nasimisu/truth-discovery-in-sequence-labels-from-crowds
tf · Official

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Machine Learning and Data Classification · Data Stream Mining Techniques