A Bayesian Approach for Sequence Tagging with Crowds
Edwin Simpson, Iryna Gurevych

TL;DR
This paper introduces a Bayesian sequence tagging model that aggregates crowdsourced annotations while accounting for annotator errors and for sequential dependencies between labels, improving accuracy and reducing labeling costs in NLP tasks.
Contribution
The paper presents a Bayesian approach for aggregating crowdsourced sequence tags that models sequential dependencies and uncertainty about annotators, outperforming previous aggregation methods on crowdsourced NLP data.
Findings
Outperforms the previous state-of-the-art aggregation methods on crowdsourced data for named entity recognition, information extraction and argument mining.
Reduces crowdsourcing costs through more effective active learning.
Improves sequence label accuracy by modeling dependencies and uncertainties.
Abstract
Current methods for sequence tagging, a core task in NLP, are data hungry, which motivates the use of crowdsourcing as a cheap way to obtain labelled data. However, annotators are often unreliable and current aggregation methods cannot capture common types of span annotation errors. To address this, we propose a Bayesian method for aggregating sequence tags that reduces errors by modelling sequential dependencies between the annotations as well as the ground-truth labels. By taking a Bayesian approach, we account for uncertainty in the model due to both annotator errors and the lack of data for modelling annotators who complete few tasks. We evaluate our model on crowdsourced data for named entity recognition, information extraction and argument mining, showing that our sequential model outperforms the previous state of the art. We also find that our approach can reduce crowdsourcing…
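To make the role of sequential dependencies concrete, here is a minimal sketch. It is not the paper's model: it assumes fixed, known annotator accuracies instead of inferring annotator behaviour in a Bayesian way, and it uses hand-picked transition probabilities. The names `TRANS`, `START`, `emission`, `majority_vote` and `viterbi_aggregate`, along with all numeric values, are hypothetical and chosen only for illustration. The sketch shows how a chain over BIO labels can repair an invalid sequence that token-level majority voting produces.

```python
import math
from collections import Counter

LABELS = ["O", "B", "I"]

# Assumed transition probabilities between true labels (illustrative values).
# In BIO tagging, "I" cannot follow "O" or start a sequence.
TRANS = {
    "O": {"O": 0.8, "B": 0.2, "I": 0.0},
    "B": {"O": 0.3, "B": 0.1, "I": 0.6},
    "I": {"O": 0.4, "B": 0.1, "I": 0.5},
}
START = {"O": 0.8, "B": 0.2, "I": 0.0}


def _log(x):
    """Logarithm that maps zero probability to -inf instead of raising."""
    return math.log(x) if x > 0 else float("-inf")


def emission(true_label, votes, accuracy):
    """Likelihood of one token's votes given a candidate true label.

    Each annotator is assumed to pick the true label with their accuracy
    and otherwise to pick one of the other labels uniformly at random.
    """
    p = 1.0
    for annotator, vote in votes.items():
        acc = accuracy[annotator]
        p *= acc if vote == true_label else (1.0 - acc) / (len(LABELS) - 1)
    return p


def majority_vote(sequence_votes):
    """Baseline: choose the most frequent label for each token independently."""
    return [Counter(v.values()).most_common(1)[0][0] for v in sequence_votes]


def viterbi_aggregate(sequence_votes, accuracy):
    """Most probable true label sequence under the chain model (Viterbi)."""
    if not sequence_votes:
        return []
    scores = [{y: _log(START[y]) + _log(emission(y, sequence_votes[0], accuracy))
               for y in LABELS}]
    backptr = []
    for votes in sequence_votes[1:]:
        prev, step, ptr = scores[-1], {}, {}
        for y in LABELS:
            best = max(LABELS, key=lambda p: prev[p] + _log(TRANS[p][y]))
            step[y] = prev[best] + _log(TRANS[best][y]) + _log(emission(y, votes, accuracy))
            ptr[y] = best
        scores.append(step)
        backptr.append(ptr)
    # Trace back from the best final label.
    path = [max(LABELS, key=lambda y: scores[-1][y])]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    return path[::-1]


# Three annotators label a three-token sequence; a1 is assumed more reliable.
accuracy = {"a1": 0.9, "a2": 0.6, "a3": 0.6}
votes = [
    {"a1": "B", "a2": "O", "a3": "O"},
    {"a1": "I", "a2": "I", "a3": "I"},
    {"a1": "O", "a2": "O", "a3": "O"},
]
print(majority_vote(votes))                # ['O', 'I', 'O'] (invalid: I after O)
print(viterbi_aggregate(votes, accuracy))  # ['B', 'I', 'O']
```

In this toy case the two less reliable annotators miss the start of the span, so majority voting outputs an "I" directly after an "O". The chain model rules that transition out and gives more weight to the more reliable annotator, recovering a valid span. The method described in the abstract goes further by also modelling dependencies between the annotations themselves and by treating annotator reliability as uncertain.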
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Data Stream Mining Techniques · Topic Modeling
