Recovering Top-Two Answers and Confusion Probability in Multi-Choice Crowdsourcing
Hyeonsu Jeong, Hye Won Chung

TL;DR
This paper introduces a model and algorithm for multi-choice crowdsourcing that recovers the top two answers and confusion probabilities, providing deeper insights into task difficulty and answer plausibility.
Contribution
The paper proposes a two-stage inference algorithm that recovers the top two answers and the confusion probability for each task, supported by a theoretical optimality guarantee and practical applications.
Findings
The algorithm achieves the minimax optimal convergence rate.
It outperforms recent algorithms on both synthetic and real data.
It enables inference of task difficulty and training with top-two soft labels.
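To make "top-two soft labels" concrete, here is a minimal sketch of one plausible way to build such a label. It assumes the confusion probability is the probability mass placed on the most confusing answer, with the rest on the ground truth. The function name `top_two_soft_label` and its parameters are hypothetical and are not taken from the paper.

```python
def top_two_soft_label(num_choices, truth, confusing, confusion_prob):
    """Build a soft label over `num_choices` classes.

    Assumption (not from the paper): `confusion_prob` is the mass assigned
    to the most confusing answer, and the ground truth gets the remainder.
    All other choices receive zero mass.
    """
    if truth == confusing:
        raise ValueError("truth and confusing answer must differ")
    if not 0.0 <= confusion_prob <= 1.0:
        raise ValueError("confusion_prob must lie in [0, 1]")
    label = [0.0] * num_choices
    label[truth] = 1.0 - confusion_prob
    label[confusing] = confusion_prob
    return label


# Example: 4 choices, ground truth is class 0, most confusing answer is class 2.
soft = top_two_soft_label(4, truth=0, confusing=2, confusion_prob=0.25)
```

A label like this can replace a one-hot target in a cross-entropy loss, so that the classifier is not penalized as heavily for placing mass on the plausible alternative.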
Abstract
Crowdsourcing has emerged as an effective platform for labeling large amounts of data in a cost- and time-efficient manner. Most previous work has focused on designing an efficient algorithm to recover only the ground-truth labels of the data. In this paper, we consider multi-choice crowdsourcing tasks with the goal of recovering not only the ground truth, but also the most confusing answer and the confusion probability. The most confusing answer provides useful information about the task by revealing the most plausible answer other than the ground truth and how plausible it is. To theoretically analyze such scenarios, we propose a model in which there are the top two plausible answers for each task, distinguished from the rest of the choices. Task difficulty is quantified by the probability of confusion between the top two, and worker reliability is quantified by the probability of…
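As a rough illustration of the quantities the abstract describes, the sketch below estimates a top answer, a runner-up, and a confusion score for a single task by counting votes. This is a naive baseline written for illustration only. It is not the paper's two-stage algorithm, it ignores worker reliability entirely, and the name `naive_top_two` is hypothetical.

```python
from collections import Counter


def naive_top_two(responses):
    """Return (top, runner_up, confusion_estimate) for one task.

    Naive vote counting, NOT the paper's method: the confusion estimate is
    the runner-up's share of the votes cast for the top two answers.
    """
    if not responses:
        raise ValueError("need at least one response")
    counts = Counter(responses).most_common(2)
    top, top_votes = counts[0]
    if len(counts) == 1:
        # Every worker gave the same answer, so no runner-up is observed.
        return top, None, 0.0
    runner_up, runner_votes = counts[1]
    return top, runner_up, runner_votes / (top_votes + runner_votes)


# Example: six workers answer one four-choice task.
top, runner_up, confusion = naive_top_two(["A", "A", "A", "B", "B", "C"])
```

In this example "A" is the top answer, "B" is the runner-up, and the confusion estimate is 2 / (3 + 2). A counting baseline like this treats all workers as equally reliable, which is the limitation that reliability-aware models are meant to address.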
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Domain Adaptation and Few-Shot Learning · Indoor and Outdoor Localization Technologies
