End-to-End Learning from Noisy Crowd to Supervised Machine Learning Models

Taraneh Younesian; Chi Hong; Amirmasoud Ghiassi; Robert Birke; Lydia Y. Chen

arXiv:2011.06833·cs.LG·November 16, 2020

TL;DR

This paper proposes an end-to-end hybrid approach combining deep models and human experts to improve label quality from noisy crowd-sourced data, significantly enhancing supervised learning accuracy.

Contribution

It introduces an online label aggregation framework that estimates annotator confusion matrices and uses expert relabeling to reduce noise and improve classification performance.

Findings

01

Label error rate reduced by over 30% with aggregation.

02

Relabeling 10% of data yields over 90% accuracy.

03

Effective on UCI datasets and the CIFAR-10 image dataset.

Abstract

Labeling real-world datasets is time-consuming but indispensable for supervised machine learning models. A common solution is to distribute the labeling task across a large number of non-expert workers via crowd-sourcing. Because crowd workers vary in background and experience, the obtained labels are highly prone to errors and can even be detrimental to the learning models. In this paper, we advocate using hybrid intelligence, i.e., combining deep models and human experts, to design an end-to-end learning framework from noisy crowd-sourced data, especially in an online scenario. We first summarize the state-of-the-art solutions that address the challenges of noisy labels from non-expert crowds and learn from multiple annotators. We show how label aggregation can benefit from estimating the annotators' confusion matrices to improve the learning process. Moreover, with the help of an…
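
The idea of aggregating labels through per-annotator confusion matrices can be illustrated with a small sketch. This is not the paper's algorithm: it is a single batch pass in the spirit of Dawid-Skene style aggregation, using majority vote as a stand-in reference and a uniform class prior. The function names (`majority_vote`, `estimate_confusion`, `aggregate`), the `smoothing` parameter, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def majority_vote(votes, n_classes):
    """votes: (n_items, n_annotators) int array; -1 marks a missing label."""
    counts = np.zeros((votes.shape[0], n_classes))
    for k in range(n_classes):
        counts[:, k] = (votes == k).sum(axis=1)
    return counts.argmax(axis=1)

def estimate_confusion(votes, reference, n_classes, smoothing=1.0):
    """Return C with C[a, t, l] ~ P(annotator a reports l | reference label t).

    `reference` is whatever label estimate is available (here: majority vote).
    `smoothing` is an additive pseudo-count that keeps every entry non-zero.
    """
    n_annotators = votes.shape[1]
    conf = np.full((n_annotators, n_classes, n_classes), smoothing)
    for a in range(n_annotators):
        seen = votes[:, a] >= 0
        # Count (reference label, reported label) pairs for annotator a.
        np.add.at(conf[a], (reference[seen], votes[seen, a]), 1.0)
    return conf / conf.sum(axis=2, keepdims=True)

def aggregate(votes, conf, n_classes):
    """Pick, per item, the class with the highest summed log-likelihood
    of the observed votes (uniform class prior assumed)."""
    n_items, n_annotators = votes.shape
    log_post = np.zeros((n_items, n_classes))
    for a in range(n_annotators):
        seen = votes[:, a] >= 0
        # conf[a][:, reported] has shape (n_classes, n_seen); transpose to items.
        log_post[seen] += np.log(conf[a][:, votes[seen, a]]).T
    return log_post.argmax(axis=1)

# Toy data (hypothetical): annotator 0 is accurate, annotator 1 makes one
# mistake, annotator 2 systematically flips the two classes.
truth = np.array([0, 1, 0, 1, 0, 1, 0, 1])
votes = np.stack([truth, truth.copy(), 1 - truth], axis=1)
votes[0, 1] = 1

mv = majority_vote(votes, 2)
conf = estimate_confusion(votes, mv, 2)
agg = aggregate(votes, conf, 2)
```

On this toy input, majority vote mislabels the first item because the mistaken and the flipping annotator agree, while the confusion-weighted aggregate recovers it: a consistently flipping annotator is still informative once the flip is modeled. An online variant would update the counts incrementally as new votes arrive, and iterating the two estimation steps would move this closer to a full expectation-maximization procedure.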

Taxonomy

Methods: Support Vector Machine