Minimax Optimal Convergence Rates for Estimating Ground Truth from Crowdsourced Labels
Chao Gao, Dengyong Zhou

TL;DR
This paper proves convergence rates for a projected EM algorithm for the Dawid-Skene estimator and shows that the exponent in the rate is optimal, giving theoretical guarantees for inferring true labels from noisy crowdsourced labels.
Contribution
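It proves convergence rates for a projected EM algorithm for the Dawid-Skene estimator, shows via a lower bound argument that the exponent in the rate is optimal, and thereby resolves the long-standing question of whether the estimator has sound theoretical guarantees.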
Findings
Proved convergence rates for a projected EM algorithm for the Dawid-Skene estimator.
Showed via a lower bound argument that the exponent in the rate of convergence is optimal.
Compared Dawid-Skene with majority voting, highlighting advantages and pitfalls.
Abstract
Crowdsourcing has become a primary means for label collection in many real-world machine learning applications. A classical method for inferring the true labels from the noisy labels provided by crowdsourcing workers is the Dawid-Skene estimator. In this paper, we prove convergence rates of a projected EM algorithm for the Dawid-Skene estimator. The revealed exponent in the rate of convergence is shown to be optimal via a lower bound argument. Our work resolves the long-standing issue of whether the Dawid-Skene estimator has sound theoretical guarantees besides its good performance observed in practice. In addition, a comparative study with majority voting illustrates both advantages and pitfalls of the Dawid-Skene estimator.
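To make the comparison in the abstract concrete, here is a minimal Python sketch of majority voting next to an EM procedure of the Dawid-Skene type. Several things in it are assumptions for illustration and are not taken from the paper: binary labels, a "one-coin" model in which each worker has a single accuracy parameter, every worker labeling every item, a uniform prior over the two classes, and a projection step implemented as clipping each estimated accuracy to [lam, 1 - lam]. The function names and the parameters n_iter and lam are hypothetical.

```python
import math


def majority_vote(labels):
    """labels[i][j] in {0, 1} is the label worker i gave to item j.
    Ties are broken in favor of 0."""
    m, n = len(labels), len(labels[0])
    return [1 if 2 * sum(labels[i][j] for i in range(m)) > m else 0
            for j in range(n)]


def _sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def one_coin_em(labels, n_iter=50, lam=0.01):
    """EM for an assumed one-coin model with clipped worker accuracies.
    Returns (predicted labels, estimated worker accuracies)."""
    m, n = len(labels), len(labels[0])
    # Initialize the posterior P(true label of item j is 1) with the vote share.
    q = [sum(labels[i][j] for i in range(m)) / m for j in range(n)]
    p = [0.5] * m
    for _ in range(n_iter):
        # M-step: expected fraction of items on which worker i is correct,
        # then clipped to [lam, 1 - lam] (the assumed projection).
        p = []
        for i in range(m):
            agree = sum(q[j] if labels[i][j] == 1 else 1.0 - q[j]
                        for j in range(n))
            p.append(min(max(agree / n, lam), 1.0 - lam))
        # E-step: posterior log-odds is a weighted vote with weights
        # log(p_i / (1 - p_i)), using a uniform class prior.
        w = [math.log(pi / (1.0 - pi)) for pi in p]
        q = [_sigmoid(sum(w[i] * (2 * labels[i][j] - 1) for i in range(m)))
             for j in range(n)]
    return [1 if qj > 0.5 else 0 for qj in q], p


if __name__ == "__main__":
    # Two workers who agree with each other and one who often dissents.
    votes = [[1, 1, 0, 0],
             [1, 1, 0, 0],
             [1, 0, 0, 1]]
    print(majority_vote(votes))
    print(one_coin_em(votes))
```

The design point the sketch illustrates is that majority voting weights every worker equally, while the EM iteration reweights votes by the log-odds of each worker's estimated accuracy. Clipping keeps those weights finite when a worker appears perfectly accurate or perfectly wrong.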
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Image and Video Quality Assessment · Data Stream Mining Techniques
