Crowdsourcing via Annotator Co-occurrence Imputation and Provable Symmetric Nonnegative Matrix Factorization
Shahana Ibrahim, Xiao Fu

TL;DR
This paper introduces a novel symmetric nonnegative matrix factorization approach for unsupervised crowdsourcing label aggregation, improving identifiability and efficiency over previous methods, with practical algorithms for co-occurrence imputation.
Contribution
It recasts Dawid-Skene model learning as a SymNMF problem, enhancing identifiability and proposing lightweight algorithms for co-occurrence imputation and model identification.
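To make the recasting concrete, here is a minimal numerical sketch, not the authors' algorithm. It assumes the standard Dawid-Skene setup: a class prior d over K classes and, for each of M annotators, a K x K confusion matrix whose columns sum to one, with annotator responses conditionally independent given the true label. Under those assumptions the pairwise co-occurrence of annotators m and l is A_m diag(d) A_l^T, and stacking all annotators gives a symmetric nonnegative factorization X = H H^T. The names K, M, d, A, H, X and the helper cooccurrence are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
K, M = 3, 4  # hypothetical number of classes and annotators

# Class prior d (sums to 1) and one confusion matrix per annotator.
# A[m][i, k] = Pr(annotator m answers i | true label is k); columns sum to 1.
d = rng.dirichlet(np.ones(K))
A = [rng.dirichlet(np.ones(K), size=K).T for _ in range(M)]

def cooccurrence(A_m, A_l, d):
    """Joint distribution of the responses of two annotators (K x K)."""
    return A_m @ np.diag(d) @ A_l.T

# Stack the confusion matrices and absorb the prior: H = [A_1; ...; A_M] diag(d)^(1/2).
H = np.vstack(A) @ np.diag(np.sqrt(d))  # shape (M*K, K), entrywise nonnegative
X = H @ H.T                             # shape (M*K, M*K), symmetric

# Block (m, l) of X equals the pairwise co-occurrence of annotators m and l.
block_01 = X[0 * K:1 * K, 1 * K:2 * K]
R_01 = cooccurrence(A[0], A[1], d)
print(np.allclose(block_01, R_01))
```

In real data, some blocks of X cannot be estimated directly: an annotator gives each item a single label, so the diagonal blocks have no empirical counterpart, and annotator pairs that never label the same item leave off-diagonal blocks empty. That gap is what the co-occurrence imputation step is meant to fill.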
Findings
Enhanced identifiability of the Dawid-Skene model.
Effective co-occurrence imputation algorithms.
Provable stability and convergence of the proposed method.
Abstract
Unsupervised learning of the Dawid-Skene (D&S) model from noisy, incomplete and crowdsourced annotations has been a long-standing challenge, and is a critical step towards reliably labeling massive data. A recent work takes a coupled nonnegative matrix factorization (CNMF) perspective, and shows appealing features: it ensures the identifiability of the D&S model and enjoys low sample complexity, as only the estimates of the co-occurrences of annotator labels are involved. However, the identifiability holds only when certain somewhat restrictive conditions are met in the context of crowdsourcing. Optimizing the CNMF criterion is also costly, and convergence assurances are elusive. This work recasts the pairwise co-occurrence based D&S model learning problem as a symmetric NMF (SymNMF) problem, which offers enhanced identifiability relative to CNMF. In practice, the SymNMF model is…
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Recommender Systems and Techniques · Multimodal Machine Learning Applications
