Crowdsourcing with Sparsely Interacting Workers
Yao Ma, Alex Olshevsky, Venkatesh Saligrama, Csaba Szepesvari

TL;DR
This paper investigates how to estimate worker skills in crowdsourcing from sparse interaction data, establishing conditions for identifiability, proposing a gradient-based estimation method, and demonstrating state-of-the-art results on real datasets.
Contribution
It introduces a graph-theoretic framework for skill estimation, characterizes conditions for identifiability, and develops a gradient descent algorithm with proven convergence and robustness.
Findings
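Skills are asymptotically identifiable if and only if an appropriate limiting version of the interaction graph is irreducible and has odd cycles.
On irreducible, aperiodic interaction graphs, the proposed gradient descent estimates converge asymptotically to the global minimum.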
The plug-in estimator achieves state-of-the-art performance on real datasets.
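The identifiability condition can be checked mechanically on a finite, undirected interaction graph, where "irreducible" corresponds to connected and "has odd cycles" corresponds to non-bipartite. The intuition: on a bipartite graph, scaling the skills on one side by t and on the other by 1/t leaves every pairwise product unchanged, so the skills cannot be pinned down. The sketch below is an illustration only; the function name and edge-list input are assumptions, and it tests a finite graph rather than the limiting graph the paper refers to.

```python
from collections import deque


def is_connected_with_odd_cycle(n, edges):
    """Return True if the undirected graph on nodes 0..n-1 is connected
    and contains at least one odd cycle (i.e. is not bipartite).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if n == 0:
        return False

    # Build adjacency lists for the undirected graph.
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # BFS 2-colouring from node 0: a colour clash on an edge means an odd cycle.
    color = [-1] * n
    color[0] = 0
    queue = deque([0])
    odd_cycle = False
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if color[v] == -1:
                color[v] = 1 - color[u]
                queue.append(v)
            elif color[v] == color[u]:
                odd_cycle = True

    # Any node left uncoloured is unreachable from node 0.
    connected = all(c != -1 for c in color)
    return connected and odd_cycle


triangle = [(0, 1), (1, 2), (0, 2)]       # odd cycle of length 3
square = [(0, 1), (1, 2), (2, 3), (3, 0)]  # even cycle only, bipartite
print(is_connected_with_odd_cycle(3, triangle))
print(is_connected_with_odd_cycle(4, square))
```

A triangle of workers passes the check, while a four-cycle fails it even though every worker shares tasks with two others.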
Abstract
We consider estimation of worker skills from worker-task interaction data (with unknown labels) for the single-coin crowdsourcing binary classification model in symmetric noise. We define the (worker) interaction graph, whose nodes are workers and in which an edge between two nodes indicates that the two workers participated in a common task. We show that skills are asymptotically identifiable if and only if an appropriate limiting version of the interaction graph is irreducible and has odd cycles. We then formulate a weighted rank-one optimization problem to estimate skills based on observations on an irreducible, aperiodic interaction graph. We propose a gradient descent scheme and show that for such interaction graphs estimates converge asymptotically to the global minimum. We characterize noise robustness of the gradient scheme in terms of spectral properties of signless Laplacians…
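The abstract does not spell out the objective, so the following is a minimal sketch under stated assumptions rather than the authors' algorithm. It assumes labels in {-1, +1} and a skill s_i = 2 p_i - 1 for a worker who is correct with probability p_i, so that the expected product of two workers' labels on a shared task is s_i s_j. It then runs plain gradient descent on the assumed weighted rank-one objective f(x) = 1/2 * sum over edges (i, j) of w_ij (c_ij - x_i x_j)^2, where c_ij is the empirical correlation between workers i and j. The function name, the input layout, the step size, and the initialization are all hypothetical choices.

```python
def estimate_skills(n, pair_corr, steps=5000, lr=0.1, init=0.5):
    """Gradient descent on an assumed weighted rank-one objective.

    n         -- number of workers
    pair_corr -- dict mapping an edge (i, j) to (weight, correlation)
    Returns a list of n skill estimates. Illustrative only.
    """
    x = [init] * n
    for _ in range(steps):
        grad = [0.0] * n
        for (i, j), (w, c) in pair_corr.items():
            residual = c - x[i] * x[j]
            # d/dx_i of 1/2 * w * (c - x_i x_j)^2 is -w * residual * x_j.
            grad[i] -= w * residual * x[j]
            grad[j] -= w * residual * x[i]
        x = [xi - lr * gi for xi, gi in zip(x, grad)]
    return x


# Noise-free toy example: a triangle (workers 0, 1, 2) plus a pendant worker 3.
true_skills = [0.8, 0.6, 0.4, 0.7]
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
pair_corr = {(i, j): (1.0, true_skills[i] * true_skills[j]) for i, j in edges}

estimates = estimate_skills(len(true_skills), pair_corr)
print([round(v, 3) for v in estimates])
```

The toy graph is connected and contains a triangle, so it satisfies the identifiability condition; only pairs that shared a task enter the objective, which is what makes the problem a weighted (masked) rank-one fit rather than a full matrix factorization.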
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Privacy-Preserving Technologies in Data · Domain Adaptation and Few-Shot Learning
