Multi-Label Annotation Aggregation in Crowdsourcing
Xuan Wei, Daniel Dajun Zeng, Junming Yin

TL;DR
This paper introduces Bayesian models and algorithms for aggregating noisy multi-label annotations in crowdsourcing, effectively accounting for annotator reliability and label dependencies to improve accuracy.
Contribution
The paper proposes Bayesian models and inference algorithms designed for multi-label annotation aggregation in crowdsourcing, modeling both label dependencies and annotator reliability.
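To make the aggregation problem concrete, the following is a minimal sketch of a much simpler baseline, not the paper's Bayesian models: a reliability-weighted vote that alternates between estimating consensus labels and re-estimating each annotator's weight from agreement with that consensus. It treats every label independently, so it ignores the label dependencies the paper models. The function name `aggregate_multilabel`, the input layout, and the 0.5 vote threshold are all assumptions chosen for illustration.

```python
from collections import defaultdict


def aggregate_multilabel(annotations, num_labels, iterations=10):
    """Illustrative baseline only (hypothetical helper, not the paper's method).

    annotations: list of (annotator_id, instance_id, set_of_label_indices).
    Returns (consensus, weight), where consensus maps each instance to a set
    of label indices and weight maps each annotator to an agreement rate.
    """
    annotators = {a for a, _, _ in annotations}
    # Start by trusting every annotator equally.
    weight = {a: 1.0 for a in annotators}
    consensus = {}

    for _ in range(iterations):
        # Step 1: weighted vote for each (instance, label) pair.
        score = defaultdict(float)
        total = defaultdict(float)
        for a, i, labels in annotations:
            total[i] += weight[a]
            for label in labels:
                score[(i, label)] += weight[a]
        # Keep a label when it holds more than half of the voting weight.
        consensus = {
            i: {l for l in range(num_labels) if score[(i, l)] > 0.5 * total[i]}
            for i in total
        }

        # Step 2: reliability = fraction of per-label decisions that match
        # the current consensus (presence and absence both count).
        agree = defaultdict(int)
        seen = defaultdict(int)
        for a, i, labels in annotations:
            for l in range(num_labels):
                agree[a] += (l in labels) == (l in consensus[i])
                seen[a] += 1
        weight = {a: (agree[a] / seen[a] if seen[a] else 0.0) for a in annotators}

    return consensus, weight
```

Calling the function with a handful of `(annotator, instance, labels)` triples returns the consensus label sets together with the estimated weights, so an annotator who consistently disagrees with the others ends up with a low weight and little influence on later votes.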
Findings
Proposed methods outperform existing approaches on real datasets.
Models accurately recover annotator types.
Annotation accuracy improves in multi-label crowdsourcing tasks.
Abstract
As a means of human-based computation, crowdsourcing has been widely used to annotate large-scale unlabeled datasets. One of the obvious challenges is how to aggregate these possibly noisy labels provided by a set of heterogeneous annotators. Another challenge stems from the difficulty in evaluating the annotator reliability without even knowing the ground truth, which can be used to build incentive mechanisms in crowdsourcing platforms. When each instance is associated with many possible labels simultaneously, the problem becomes even harder because of its combinatorial nature. In this paper, we present new flexible Bayesian models and efficient inference algorithms for multi-label annotation aggregation by taking both annotator reliability and label dependency into account. Extensive experiments on real-world datasets confirm that the proposed methods outperform other competitive…
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Data Stream Mining Techniques · Machine Learning and Data Classification
