Unsupervised Crowdsourcing with Accuracy and Cost Guarantees

Yashvardhan Didwania; Jayakrishnan Nair; N. Hemachandra

arXiv:2207.01988·cs.LG·July 6, 2022

TL;DR

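This paper introduces algorithms for cost-effective, unsupervised binary classification of items via crowdsourcing. Worker classes are modeled by unknown confusion matrices with known per-label prices, and the algorithms provably meet a prescribed error threshold at near-optimal cost when the number of unlabeled items is large enough.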

Contribution

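It proposes algorithms for acquiring label predictions from workers and for inferring the true labels of items, with theoretical accuracy and cost guarantees that hold when the number of unlabeled items is large enough, and it validates them through a case study.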

Findings

01

The algorithms satisfy the prescribed error thresholds at near-optimal cost, provided the number of unlabeled items is large enough.

02

An extensive case study validates the algorithms, along with some heuristics inspired by them.

03

Worker heterogeneity is modeled by assigning each worker class an unknown confusion matrix and a known price per label prediction.

Abstract

We consider the problem of cost-optimal utilization of a crowdsourcing platform for binary, unsupervised classification of a collection of items, given a prescribed error threshold. Workers on the crowdsourcing platform are assumed to be divided into multiple classes, based on their skill, experience, and/or past performance. We model each worker class via an unknown confusion matrix, and a (known) price to be paid per label prediction. For this setting, we propose algorithms for acquiring label predictions from workers, and for inferring the true labels of items. We prove that if the number of (unlabeled) items available is large enough, our algorithms satisfy the prescribed error thresholds, incurring a cost that is near-optimal. Finally, we validate our algorithms, and some heuristics inspired by them, through an extensive case study.
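
To make the setting in the abstract concrete, the following is a minimal Python sketch of the model only, not of the paper's algorithms. It assumes a worker class is described by a 2x2 confusion matrix (reduced here to two per-label accuracies) and a known price per label prediction, and it substitutes plain majority voting for the paper's inference procedure. The names `WorkerClass`, `majority_vote`, and `simulate` are hypothetical, and the confusion-matrix entries are used only to generate synthetic predictions; in the paper's setting they are unknown to the algorithm.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerClass:
    """Hypothetical worker class: a known price and a binary confusion matrix."""
    name: str
    price: float        # known price paid per label prediction
    acc_on_0: float     # P(predict 0 | true label 0); unknown to the requester
    acc_on_1: float     # P(predict 1 | true label 1); unknown to the requester

    def predict(self, true_label: int, rng: random.Random) -> int:
        # Draw one noisy prediction according to the confusion matrix.
        acc = self.acc_on_1 if true_label == 1 else self.acc_on_0
        return true_label if rng.random() < acc else 1 - true_label


def majority_vote(predictions: list[int]) -> int:
    # Stand-in inference rule (not the paper's); ties go to label 1.
    return 1 if 2 * sum(predictions) >= len(predictions) else 0


def simulate(worker_class: WorkerClass, n_items: int,
             labels_per_item: int, seed: int = 0) -> tuple[float, float]:
    """Return (empirical error rate, total cost) for one worker class."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_items):
        truth = rng.randint(0, 1)  # synthetic ground truth, hidden from inference
        preds = [worker_class.predict(truth, rng) for _ in range(labels_per_item)]
        if majority_vote(preds) != truth:
            errors += 1
    total_cost = worker_class.price * n_items * labels_per_item
    return errors / n_items, total_cost
```

Calling `simulate` for a cheap, noisy class and for an expensive, accurate class with different values of `labels_per_item` illustrates the trade-off the paper studies: more or better labels lower the error rate but raise the cost, and the goal is to meet a prescribed error threshold as cheaply as possible.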

Taxonomy

Topics: Auction Theory and Applications · Imbalanced Data Classification Techniques · Mobile Crowdsensing and Crowdsourcing