Crowdsourcing Utilizing Subgroup Structure of Latent Factor Modeling

Qi Xu; Yubai Yuan; Junhui Wang; Annie Qu

arXiv:2302.02304 · stat.ME · September 28, 2023 · 1 citation

TL;DR

This paper introduces a two-stage model for multicategory crowdsourcing that leverages subgroup structures of tasks and workers, improving label prediction accuracy despite noisy contributions.

Contribution

The paper proposes a novel latent factor model with subgroup structures and a concordance-based approach, enhancing multicategory crowdsourcing label prediction.

Findings

01. Outperforms existing methods in simulations

02. Establishes estimation consistency of the latent factors in theory

03. Demonstrates superior performance on real data

Abstract

Crowdsourcing has emerged as an alternative solution for collecting large scale labels. However, the majority of recruited workers are not domain experts, so their contributed labels could be noisy. In this paper, we propose a two-stage model to predict the true labels for multicategory classification tasks in crowdsourcing. In the first stage, we fit the observed labels with a latent factor model and incorporate subgroup structures for both tasks and workers through a multi-centroid grouping penalty. Group-specific rotations are introduced to align workers with different task categories to solve multicategory crowdsourcing tasks. In the second stage, we propose a concordance-based approach to identify high-quality worker subgroups who are relied upon to assign labels to tasks. In theory, we show the estimation consistency of the latent factors and the prediction consistency of the…
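The second stage described in the abstract can be illustrated with a toy sketch. This is not the paper's estimator: it assumes the worker subgroup memberships are already known (in the paper they come from the first-stage latent factor model), and it measures concordance as the plain pairwise agreement rate between workers, which is a stand-in chosen for this example. The function names, the simulated data, and the accuracy levels are all hypothetical.

```python
import numpy as np


def pairwise_agreement(labels):
    """Average fraction of tasks on which two workers in the group agree.

    labels: (n_workers, n_tasks) integer array of category labels.
    Returns 0.0 for groups with fewer than two workers.
    """
    n_workers = labels.shape[0]
    if n_workers < 2:
        return 0.0
    rates = [
        np.mean(labels[i] == labels[j])
        for i in range(n_workers)
        for j in range(i + 1, n_workers)
    ]
    return float(np.mean(rates))


def predict_from_most_concordant_group(labels, groups, n_categories):
    """Pick the subgroup with the highest internal agreement, then take a
    majority vote over that subgroup only.

    groups: (n_workers,) array of subgroup ids (assumed given here).
    """
    best_group = max(
        np.unique(groups),
        key=lambda g: pairwise_agreement(labels[groups == g]),
    )
    chosen = labels[groups == best_group]
    # votes[k, t] = number of chosen workers assigning category k to task t
    votes = np.stack([(chosen == k).sum(axis=0) for k in range(n_categories)])
    return best_group, votes.argmax(axis=0)


def simulate_worker(truth, accuracy, n_categories, rng):
    """Hypothetical noise model: keep the true label with probability
    `accuracy`, otherwise draw a label uniformly at random."""
    random_labels = rng.integers(0, n_categories, size=truth.shape)
    keep = rng.random(truth.shape) < accuracy
    return np.where(keep, truth, random_labels)


rng = np.random.default_rng(0)
n_categories = 3
truth = rng.integers(0, n_categories, size=200)

# Group 0: five fairly reliable workers. Group 1: five workers labeling at random.
labels = np.stack(
    [simulate_worker(truth, 0.9, n_categories, rng) for _ in range(5)]
    + [simulate_worker(truth, 0.0, n_categories, rng) for _ in range(5)]
)
groups = np.array([0] * 5 + [1] * 5)

best_group, predicted = predict_from_most_concordant_group(
    labels, groups, n_categories
)
print("selected group:", best_group)
print("accuracy:", np.mean(predicted == truth))
```

Workers who label at random rarely agree with one another, while workers who track the truth agree often, so within-group agreement serves as a proxy for quality without access to the true labels. The sketch leaves out the multi-centroid grouping penalty and the group-specific rotations mentioned in the abstract.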

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Text and Document Classification Technologies · Imbalanced Data Classification Techniques