End-to-End Learning from Noisy Crowd to Supervised Machine Learning Models
Taraneh Younesian, Chi Hong, Amirmasoud Ghiassi, Robert Birke, Lydia Y. Chen

TL;DR
This paper proposes an end-to-end hybrid approach combining deep models and human experts to improve label quality from noisy crowd-sourced data, significantly enhancing supervised learning accuracy.
Contribution
It introduces an online label aggregation framework that estimates annotator confusion matrices and uses expert relabeling to reduce noise and improve classification performance.
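To make the idea of estimating annotator confusion matrices concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it uses a batch, Dawid-Skene-style EM loop initialized from majority vote, and it leaves out both the online setting and the expert relabeling step. The function names (majority_vote, estimate_confusion, aggregate) and the parameters smoothing and n_iter are hypothetical, chosen only for illustration.

```python
import numpy as np


def majority_vote(labels, n_classes):
    """labels: (n_samples, n_annotators) integer array of crowd labels."""
    counts = np.stack(
        [(labels == c).sum(axis=1) for c in range(n_classes)], axis=1
    )
    return counts.argmax(axis=1)


def estimate_confusion(labels, posteriors, smoothing=1e-2):
    """Return conf[a, k, c] ~ P(annotator a answers c | true class is k)."""
    n_annotators = labels.shape[1]
    n_classes = posteriors.shape[1]
    # Small additive smoothing (an assumption) keeps every entry positive.
    conf = np.full((n_annotators, n_classes, n_classes), smoothing)
    for a in range(n_annotators):
        for c in range(n_classes):
            # Soft count of each true class among samples that a labeled c.
            conf[a, :, c] += posteriors[labels[:, a] == c].sum(axis=0)
    return conf / conf.sum(axis=2, keepdims=True)


def aggregate(labels, n_classes, n_iter=10):
    """Alternate confusion-matrix estimation and label re-estimation."""
    # Start from one-hot majority-vote labels.
    posteriors = np.eye(n_classes)[majority_vote(labels, n_classes)]
    for _ in range(n_iter):
        conf = estimate_confusion(labels, posteriors)
        prior = posteriors.mean(axis=0)
        log_post = np.tile(np.log(prior + 1e-12), (labels.shape[0], 1))
        for a in range(labels.shape[1]):
            # Likelihood of annotator a's answer under each candidate class.
            log_post += np.log(conf[a][:, labels[:, a]]).T
        log_post -= log_post.max(axis=1, keepdims=True)
        posteriors = np.exp(log_post)
        posteriors /= posteriors.sum(axis=1, keepdims=True)
    return posteriors.argmax(axis=1), conf
```

Calling aggregate(labels, n_classes) returns one aggregated label per sample together with one estimated confusion matrix per annotator. Annotators whose matrices are far from the identity contribute less evidence to the aggregated label than they would under plain majority voting.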
Findings
Label error rate reduced by over 30% with aggregation.
Relabeling 10% of data yields over 90% accuracy.
Effective on image datasets like UCI and CIFAR-10.
Abstract
Labeling real-world datasets is time consuming but indispensable for supervised machine learning models. A common solution is to distribute the labeling task across a large number of non-expert workers via crowd-sourcing. Due to the varying background and experience of crowd workers, the obtained labels are highly prone to errors and even detrimental to the learning models. In this paper, we advocate using hybrid intelligence, i.e., combining deep models and human experts, to design an end-to-end learning framework from noisy crowd-sourced data, especially in an on-line scenario. We first summarize the state-of-the-art solutions that address the challenges of noisy labels from non-expert crowd and learn from multiple annotators. We show how label aggregation can benefit from estimating the annotators' confusion matrices to improve the learning process. Moreover, with the help of an…
Taxonomy
Methods: Support Vector Machine
