Optimizing the Wisdom of the Crowd: Inference, Learning, and Teaching
Yao Zhou, Jingrui He

TL;DR
This paper explores how modeling worker diversity and correlations can improve crowdsourced label quality beyond traditional aggregation methods, across three problems: inference, learning, and teaching.
Contribution
It introduces a novel framework that incorporates worker diversity and correlations into crowdsourcing, advancing beyond classic aggregation models for better label inference.
Findings
Incorporating worker diversity improves label inference accuracy.
Modeling worker correlations enhances learning from crowdsourced data.
Proposed methods outperform traditional aggregation approaches.
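For context, here is a minimal sketch of the kind of aggregation baseline the findings compare against. It assumes plain majority voting, which this summary does not name explicitly. The toy data and the helper names `majority_vote` and `worker_agreement` are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (worker_id, item_id, label) triples.
crowd_labels = [
    ("w1", "item_a", 1),
    ("w2", "item_a", 1),
    ("w3", "item_a", 0),
    ("w1", "item_b", 0),
    ("w2", "item_b", 1),
    ("w3", "item_b", 0),
]


def majority_vote(triples):
    """Aggregate labels per item by majority vote.

    Worker identity is ignored here, which is exactly the information
    that aggregation discards.
    """
    votes = defaultdict(Counter)
    for _worker, item, label in triples:
        votes[item][label] += 1
    return {item: counts.most_common(1)[0][0] for item, counts in votes.items()}


def worker_agreement(triples, aggregated):
    """Fraction of each worker's labels that match the aggregated label.

    A crude per-worker signal (an illustrative assumption, not the
    paper's model) that is only available if worker identity is kept.
    """
    hits = Counter()
    totals = Counter()
    for worker, item, label in triples:
        totals[worker] += 1
        if aggregated[item] == label:
            hits[worker] += 1
    return {worker: hits[worker] / totals[worker] for worker in totals}


aggregated = majority_vote(crowd_labels)
agreement = worker_agreement(crowd_labels, aggregated)
print(aggregated)
print(agreement)
```

The first function keeps one label per item and drops the worker column entirely. The second shows that even a simple per-worker statistic needs the full triples, which is the kind of annotation information the abstract says aggregation loses.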
Abstract
The unprecedented demand for large amounts of data has catalyzed the trend of combining human insights with machine learning techniques, which facilitates the use of crowdsourcing to enlist label information both effectively and efficiently. The classic work on crowdsourcing mainly focuses on the label inference problem under the categorization setting. However, inferring the true label requires sophisticated aggregation models that usually perform well only under certain assumptions. Meanwhile, no matter how complicated the aggregation model is, the true model that generated the crowd labels remains unknown. Therefore, label inference can never recover the ground truth perfectly. Because crowdsourced labels are abundant and aggregation discards rich annotation information (e.g., which worker provided which labels), we believe that…
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Data Stream Mining Techniques · Privacy-Preserving Technologies in Data
