Spectral Methods meet EM: A Provably Optimal Algorithm for Crowdsourcing
Yuchen Zhang, Xi Chen, Dengyong Zhou, Michael I. Jordan

TL;DR
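This paper introduces a two-stage algorithm for crowdsourced label inference: a spectral method produces an initial parameter estimate, and EM then refines it under the Dawid-Skene objective. The algorithm achieves the optimal convergence rate up to a logarithmic factor and outperforms several recent methods in experiments.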
Contribution
The paper presents an efficient algorithm that combines spectral methods and EM for multi-class crowdsourcing, with a convergence-rate guarantee and empirical accuracy comparable to the most accurate existing methods.
Findings
Achieves near-optimal convergence rate up to a logarithmic factor.
Performs comparably to the most accurate empirical methods.
Outperforms several recently proposed algorithms in experiments.
Abstract
Crowdsourcing is a popular paradigm for effectively collecting labels at low cost. The Dawid-Skene estimator has been widely used for inferring the true labels from the noisy labels provided by non-expert crowdsourcing workers. However, since the estimator maximizes a non-convex log-likelihood function, it is hard to theoretically justify its performance. In this paper, we propose a two-stage efficient algorithm for multi-class crowd labeling problems. The first stage uses the spectral method to obtain an initial estimate of parameters. Then the second stage refines the estimation by optimizing the objective function of the Dawid-Skene estimator via the EM algorithm. We show that our algorithm achieves the optimal convergence rate up to a logarithmic factor. We conduct extensive experiments on synthetic and real datasets. Experimental results demonstrate that the proposed algorithm is…
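As a rough illustration of the second stage only, the sketch below runs EM on the Dawid-Skene model with NumPy. It is not the authors' implementation: the spectral initialization described in the abstract is replaced here by a plain majority-vote initialization, and the function name `dawid_skene_em`, the `-1` missing-label convention, and the `smoothing` parameter are assumptions made for this example.

```python
import numpy as np


def dawid_skene_em(labels, num_classes, num_iters=50, smoothing=1e-2):
    """EM for the Dawid-Skene model (illustrative sketch).

    labels: int array of shape (num_workers, num_items); -1 marks a
            missing label (an assumed convention for this example).
    Returns (predicted_labels, confusion, prior).
    """
    num_workers, num_items = labels.shape

    # One-hot encode observed labels: shape (workers, items, classes).
    onehot = np.zeros((num_workers, num_items, num_classes))
    w_idx, i_idx = np.nonzero(labels >= 0)
    onehot[w_idx, i_idx, labels[w_idx, i_idx]] = 1.0

    # Initialization by smoothed majority vote. NOTE: this stands in for
    # the paper's spectral stage, which is not reproduced here.
    posterior = onehot.sum(axis=0) + smoothing
    posterior /= posterior.sum(axis=1, keepdims=True)

    for _ in range(num_iters):
        # M-step: class prior and per-worker confusion matrices, where
        # confusion[w, k, l] estimates P(worker w reports l | truth is k).
        prior = posterior.sum(axis=0) + smoothing
        prior /= prior.sum()
        confusion = np.einsum("ik,wil->wkl", posterior, onehot) + smoothing
        confusion /= confusion.sum(axis=2, keepdims=True)

        # E-step: posterior over each item's true label, in log space.
        # Missing labels have all-zero one-hot rows and contribute nothing.
        log_post = np.log(prior)[None, :] + np.einsum(
            "wil,wkl->ik", onehot, np.log(confusion)
        )
        log_post -= log_post.max(axis=1, keepdims=True)
        posterior = np.exp(log_post)
        posterior /= posterior.sum(axis=1, keepdims=True)

    return posterior.argmax(axis=1), confusion, prior
```

Calling `dawid_skene_em(labels, num_classes=3)` on a workers-by-items label matrix returns the inferred labels together with the estimated confusion matrices and class prior. Because the objective is non-convex, the result depends on the starting point, which is the motivation the abstract gives for a principled first stage.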
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Indoor and Outdoor Localization Technologies · Infrastructure Maintenance and Monitoring
