OpinionRank: Extracting Ground Truth Labels from Unreliable Expert Opinions with Graph-Based Spectral Ranking

Glenn Dawson; Robi Polikar

arXiv:2102.05884·cs.LG·June 8, 2021

TL;DR

OpinionRank is a scalable, graph-based spectral algorithm that integrates unreliable crowdsourced labels into trustworthy ground truth labels, outperforming more complex models in both efficiency and accuracy.

Contribution

The paper introduces a novel, model-free spectral ranking method that extracts reliable labels from unreliable crowdsourced annotations while improving scalability and interpretability.

Findings

01

OpinionRank outperforms more complex algorithms in accuracy.

02

It is scalable to large datasets and many label sources.

03

It requires fewer computational resources than existing methods.

Abstract

As larger and more comprehensive datasets become standard in contemporary machine learning, it becomes increasingly difficult to obtain reliable, trustworthy label information with which to train sophisticated models. To address this problem, crowdsourcing has emerged as a popular, inexpensive, and efficient data mining solution for performing distributed label collection. However, crowdsourced annotations are inherently untrustworthy, as the labels are provided by anonymous volunteers who may have varying, unreliable expertise. Worse yet, some participants on commonly used platforms such as Amazon Mechanical Turk may be adversarial and provide intentionally incorrect label information without the end user's knowledge. We discuss three conventional models of the label generation process, describing their parameterizations and the model-based approaches used to solve them. We then…
