Statistical Decision Making for Optimal Budget Allocation in Crowd Labeling
Xi Chen, Qihang Lin, Dengyong Zhou

TL;DR
This paper formulates budget allocation for crowd labeling tasks as a Bayesian MDP and introduces an efficient approximate policy that improves labeling accuracy over existing methods.
Contribution
It formulates the budget allocation as a Bayesian MDP and proposes a computationally efficient optimistic knowledge gradient policy for better accuracy.
Findings
The proposed policy outperforms existing methods on simulated data.
It achieves higher labeling accuracy at the same budget level.
The approach applies to both homogeneous and heterogeneous worker marketplaces.
Abstract
In crowd labeling, a large amount of unlabeled data instances are outsourced to a crowd of workers. Workers will be paid for each label they provide, but the labeling requester usually has only a limited amount of the budget. Since data instances have different levels of labeling difficulty and workers have different reliability, it is desirable to have an optimal policy to allocate the budget among all instance-worker pairs such that the overall labeling accuracy is maximized. We consider categorical labeling tasks and formulate the budget allocation problem as a Bayesian Markov decision process (MDP), which simultaneously conducts learning and decision making. Using the dynamic programming (DP) recurrence, one can obtain the optimal allocation policy. However, DP quickly becomes computationally intractable when the size of the problem increases. To solve this challenge, we propose a…
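The sequential allocation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on simplifying assumptions that the abstract does not state: labels are binary, every worker is equally reliable, each instance starts from a uniform Beta(1, 1) prior, and the posterior of instance i is tracked as integer counts (a_i, b_i). The function names (`prob_positive`, `accuracy`, `optimistic_gain`, `choose_instance`, `update`) are hypothetical. At each step the sketch spends one unit of budget on the instance whose expected accuracy would improve most under the more favorable of the two possible next labels, which is one way to read an "optimistic" one-step lookahead.

```python
from math import comb


def prob_positive(a, b):
    """P(theta >= 0.5) for theta ~ Beta(a, b), with positive integers a and b.

    Uses the identity between the Beta CDF at 0.5 and a Binomial(a+b-1, 0.5) tail.
    """
    n = a + b - 1
    return sum(comb(n, j) for j in range(a)) / 2 ** n


def accuracy(a, b):
    """Expected accuracy of the best guess for one instance in state (a, b)."""
    p = prob_positive(a, b)
    return max(p, 1 - p)


def optimistic_gain(a, b):
    """Larger of the two one-step accuracy gains (next label is 1, or is 0)."""
    base = accuracy(a, b)
    return max(accuracy(a + 1, b) - base, accuracy(a, b + 1) - base)


def choose_instance(states):
    """Index of the instance with the largest optimistic one-step gain."""
    gains = [optimistic_gain(a, b) for a, b in states]
    return max(range(len(states)), key=gains.__getitem__)


def update(states, i, label):
    """Bayesian update of instance i after observing a binary label."""
    a, b = states[i]
    states[i] = (a + 1, b) if label == 1 else (a, b + 1)
```

In use, a requester would start with `states = [(1, 1)] * K` as a list, then repeat `i = choose_instance(states)`, query a worker for instance i, and call `update(states, i, label)` until the budget runs out. Each step costs only a scan over the instances, in contrast to the exact DP recurrence, whose state space grows quickly with the budget.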
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Auction Theory and Applications · Data Stream Mining Techniques
