Modeling sequential annotations for sequence labeling with crowds
Xiaolei Lu, Tommy W.S. Chow

TL;DR
This paper introduces a probabilistic model for crowd-based sequence labeling that jointly estimates annotator expertise and label sequences, improving accuracy and efficiency in building large labeled datasets.
Contribution
It proposes a novel joint modeling approach with a label sequence inference method to enhance crowd annotation quality for sequence labeling tasks.
Findings
The model effectively captures annotator expertise and label dependencies.
The valid label sequence inference (VLSE) method reduces the number of candidate label sequences and improves ground-truth estimation.
Experimental results demonstrate improved accuracy on NLP sequence labeling tasks.
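To make the notion of annotator expertise concrete, here is a minimal Python sketch of a much simpler baseline than the one the paper proposes: a token-wise majority vote over crowd label sequences, followed by each annotator's agreement rate with that consensus. This is not the SA-SLC model. It ignores the label dependencies the paper models, and the function names, annotator IDs, and toy labels are all hypothetical.

```python
from collections import Counter


def majority_vote(annotations):
    """Return the token-wise majority label across annotators.

    `annotations` maps a (hypothetical) annotator id to a label sequence;
    all sequences are assumed to have the same length.
    """
    n_tokens = len(next(iter(annotations.values())))
    consensus = []
    for i in range(n_tokens):
        votes = Counter(seq[i] for seq in annotations.values())
        consensus.append(votes.most_common(1)[0][0])
    return consensus


def agreement_rates(annotations, consensus):
    """Fraction of tokens on which each annotator matches the consensus."""
    return {
        name: sum(a == c for a, c in zip(seq, consensus)) / len(consensus)
        for name, seq in annotations.items()
    }


# Toy data (invented for illustration): three annotators label four tokens.
annotations = {
    "a1": ["B-PER", "I-PER", "O", "O"],
    "a2": ["B-PER", "I-PER", "O", "B-LOC"],
    "a3": ["B-PER", "O", "O", "O"],
}

consensus = majority_vote(annotations)
rates = agreement_rates(annotations, consensus)
print(consensus)
print(rates)
```

Because each token is voted on independently, this baseline can produce label sequences that violate tagging constraints (for example, an I-PER tag with no preceding B-PER). Handling such dependencies jointly with annotator reliability is the gap the paper's model is described as addressing.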
Abstract
Crowd sequential annotations can be an efficient and cost-effective way to build large datasets for sequence labeling. Unlike tagging independent instances, the quality of a crowd-annotated label sequence depends on each annotator's expertise in capturing the internal dependencies among the tokens in the sequence. In this paper, we propose Modeling Sequential Annotation for Sequence Labeling with Crowds (SA-SLC). First, a conditional probabilistic model is developed to jointly model sequential data and annotators' expertise, in which a categorical distribution is introduced to estimate the reliability of each annotator in capturing local and non-local label dependencies for sequential annotation. To accelerate the marginalization of the proposed model, a valid label sequence inference (VLSE) method is proposed to derive the valid ground-truth label sequences from…
