Modeling sequential annotations for sequence labeling with crowds

Xiaolei Lu; Tommy W.S. Chow

arXiv:2209.09430·cs.CL·September 21, 2022

TL;DR

This paper introduces a probabilistic model for crowd-based sequence labeling that jointly estimates annotator expertise and label sequences, improving accuracy and efficiency in building large labeled datasets.

Contribution

The paper proposes SA-SLC, a conditional probabilistic model that jointly models sequential data and annotators' expertise, together with a valid label sequence inference (VLSE) method that accelerates marginalization of the model.

Findings

01

The model estimates each annotator's reliability in capturing local and non-local label dependencies.

02

The valid label sequence inference (VLSE) method reduces the number of candidate label sequences and improves ground-truth estimation.

03

Experimental results demonstrate improved accuracy on NLP sequence labeling tasks.
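
As a rough illustration of what "annotator expertise" can mean in this setting, the sketch below estimates a per-annotator categorical distribution P(annotator label | reference label) at the token level. It is not the SA-SLC model: it assumes a majority-vote reference in place of the unknown ground truth, ignores label dependencies entirely, and the function names and the smoothing parameter `alpha` are hypothetical.

```python
from collections import Counter


def majority_vote(crowd_labels):
    """Per-token majority label across annotators (a stand-in for ground truth)."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*crowd_labels)]


def annotator_confusion(annotations, reference, labels, alpha=1.0):
    """Estimate P(annotator label | reference label) with add-alpha smoothing."""
    # Start every cell at alpha so unseen label pairs keep non-zero probability.
    counts = {ref: {obs: alpha for obs in labels} for ref in labels}
    for ref, obs in zip(reference, annotations):
        counts[ref][obs] += 1
    # Normalize each row into a categorical distribution.
    return {
        ref: {obs: c / sum(row.values()) for obs, c in row.items()}
        for ref, row in counts.items()
    }


# Toy data: three annotators label the same three-token sentence.
labels = ["B-PER", "I-PER", "O"]
crowd_labels = [
    ["B-PER", "I-PER", "O"],
    ["B-PER", "O", "O"],
    ["B-PER", "I-PER", "O"],
]
reference = majority_vote(crowd_labels)
confusion = annotator_confusion(crowd_labels[1], reference, labels)
```

Here `confusion["I-PER"]["O"]` is the smoothed estimate of how often the second annotator writes `O` where the reference says `I-PER`. A model that accounts for label dependencies, as the paper describes, would condition on more than the single reference tag.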

Abstract

Crowd sequential annotations can be an efficient and cost-effective way to build large datasets for sequence labeling. Unlike tagging independent instances, the quality of a crowd-annotated label sequence depends on the annotators' expertise in capturing the internal dependencies among the tokens in the sequence. In this paper, we propose Modeling sequential annotation for sequence labeling with crowds (SA-SLC). First, a conditional probabilistic model is developed to jointly model sequential data and annotators' expertise, in which a categorical distribution is introduced to estimate the reliability of each annotator in capturing local and non-local label dependencies for sequential annotation. To accelerate the marginalization of the proposed model, a valid label sequence inference (VLSE) method is proposed to derive the valid ground-truth label sequences from…
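
To make the idea of restricting marginalization to valid label sequences concrete, here is a minimal sketch under stated assumptions. It is not the paper's VLSE procedure, whose details are cut off in the abstract above. It assumes BIO tagging, builds candidates only from labels that some annotator proposed at each token, and keeps the candidates in which every `I-X` tag continues a `B-X` or `I-X` tag. The helper names are hypothetical.

```python
from itertools import product


def is_valid_bio(sequence):
    """True if every I-X tag directly follows a B-X or I-X tag of the same type."""
    previous = "O"
    for tag in sequence:
        if tag.startswith("I-"):
            entity = tag[2:]
            if previous not in (f"B-{entity}", f"I-{entity}"):
                return False
        previous = tag
    return True


def candidate_sequences(crowd_labels):
    """Enumerate sequences built from the labels annotators proposed per token."""
    per_token = [sorted(set(column)) for column in zip(*crowd_labels)]
    return [list(seq) for seq in product(*per_token)]


# Toy data: three annotators label the same four-token sentence.
crowd_labels = [
    ["B-PER", "I-PER", "O", "B-LOC"],
    ["B-PER", "O", "O", "I-LOC"],
    ["O", "I-PER", "O", "B-LOC"],
]
all_candidates = candidate_sequences(crowd_labels)
valid_candidates = [seq for seq in all_candidates if is_valid_bio(seq)]
print(len(all_candidates), len(valid_candidates))  # prints "8 3"
```

In this toy case the validity filter cuts eight candidates down to three, which is the kind of reduction that makes summing over candidate ground-truth sequences cheaper. Full enumeration grows exponentially with sentence length, so a practical method would need something more efficient than this brute-force product.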
