OpinionRank: Extracting Ground Truth Labels from Unreliable Expert Opinions with Graph-Based Spectral Ranking
Glenn Dawson, Robi Polikar

TL;DR
OpinionRank is a scalable, graph-based spectral algorithm that effectively integrates unreliable crowdsourced labels into trustworthy ground truth labels, outperforming complex models in efficiency and accuracy.
Contribution
The paper introduces a novel, model-free spectral ranking method for extracting reliable labels from unreliable crowdsourced annotations, improving scalability and interpretability.
Findings
OpinionRank outperforms more complex algorithms in accuracy.
It is scalable to large datasets and many label sources.
It requires fewer computational resources than existing methods.
Abstract
As larger and more comprehensive datasets become standard in contemporary machine learning, it becomes increasingly difficult to obtain reliable, trustworthy label information with which to train sophisticated models. To address this problem, crowdsourcing has emerged as a popular, inexpensive, and efficient data mining solution for performing distributed label collection. However, crowdsourced annotations are inherently untrustworthy, as the labels are provided by anonymous volunteers who may have varying, unreliable expertise. Worse yet, some participants on commonly used platforms such as Amazon Mechanical Turk may be adversarial and provide intentionally incorrect label information without the end user's knowledge. We discuss three conventional models of the label generation process, describing their parameterizations and the model-based approaches used to solve them. We then…
