Recovering Top-Two Answers and Confusion Probability in Multi-Choice Crowdsourcing

Hyeonsu Jeong; Hye Won Chung

arXiv:2301.00006·cs.HC·June 1, 2023

TL;DR

This paper introduces a model and algorithm for multi-choice crowdsourcing that recovers the top two answers and confusion probabilities, providing deeper insights into task difficulty and answer plausibility.

Contribution

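The paper proposes a two-stage inference algorithm that recovers the top-two answers and the confusion probability for each task. The algorithm is shown to achieve the minimax optimal convergence rate, and its outputs are applied to inferring task difficulty and to training with top-two soft labels.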

Findings

01 The algorithm achieves the minimax optimal convergence rate.

02 It outperforms recent algorithms on both synthetic and real data.

03 It enables inference of task difficulty and training with top-two soft labels.
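
To make "top-two soft labels" concrete, here is a minimal sketch of one way such a label could be built once a ground truth, a most confusing answer, and a confusion probability are available. The function name `top_two_soft_label` and the choice to place mass 1 - q on the ground truth and q on the confusing answer are assumptions for illustration; this page does not state the paper's exact construction.

```python
def top_two_soft_label(num_choices, top, runner_up, confusion_prob):
    """Build a soft label that spreads probability over two choices only.

    Assumption (not taken from the paper): the ground truth gets
    1 - confusion_prob, the most confusing answer gets confusion_prob,
    and every other choice gets zero.
    """
    if top == runner_up:
        raise ValueError("top and runner_up must be different choices")
    if not 0.0 <= confusion_prob <= 1.0:
        raise ValueError("confusion_prob must lie in [0, 1]")
    label = [0.0] * num_choices
    label[top] = 1.0 - confusion_prob
    label[runner_up] = confusion_prob
    return label


# Four choices, ground truth at index 1, most confusing answer at index 3.
soft = top_two_soft_label(4, top=1, runner_up=3, confusion_prob=0.25)
```

A label of this shape can stand in for a one-hot target in a cross-entropy loss, so that harder tasks contribute a less peaked target.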

Abstract

Crowdsourcing has emerged as an effective platform for labeling large amounts of data in a cost- and time-efficient manner. Most previous work has focused on designing an efficient algorithm to recover only the ground-truth labels of the data. In this paper, we consider multi-choice crowdsourcing tasks with the goal of recovering not only the ground truth, but also the most confusing answer and the confusion probability. The most confusing answer provides useful information about the task by revealing the most plausible answer other than the ground truth and how plausible it is. To theoretically analyze such scenarios, we propose a model in which there are the top two plausible answers for each task, distinguished from the rest of the choices. Task difficulty is quantified by the probability of confusion between the top two, and worker reliability is quantified by the probability of…
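
As a rough illustration of the quantities the abstract names, the sketch below reads off the two most frequent answers for a single task from raw worker votes and uses the runner-up's share of the top-two votes as a crude stand-in for the confusion probability. This is a naive plug-in estimate, not the paper's two-stage algorithm: it ignores worker reliability entirely, and the function name `top_two_from_votes` and the sample votes are hypothetical.

```python
from collections import Counter


def top_two_from_votes(votes):
    """Return (top answer, runner-up, runner-up share among top-two votes).

    Naive per-task estimate from vote counts alone; worker reliability
    is not modeled here.
    """
    if not votes:
        raise ValueError("votes must be non-empty")
    counts = Counter(votes).most_common(2)
    if len(counts) < 2:
        # Only one distinct answer was given, so no runner-up is observed.
        return counts[0][0], None, 0.0
    (first, n1), (second, n2) = counts
    return first, second, n2 / (n1 + n2)


# Hypothetical votes from ten workers on one multiple-choice task.
votes = ["B", "B", "C", "B", "C", "A", "B", "C", "B", "B"]
first, second, share = top_two_from_votes(votes)
```

Here six workers chose "B" and three chose "C", so "C" is the runner-up and accounts for one third of the votes cast for the top two answers.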

Code & Models

Repositories

hyeonsu-jeong/toptwo · PyTorch · Official

Videos

Recovering Top-Two Answers and Confusion Probability in Multi-Choice Crowdsourcing · SlidesLive

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Domain Adaptation and Few-Shot Learning · Indoor and Outdoor Localization Technologies