Toward Effective Automated Content Analysis via Crowdsourcing

Jiele Wu; Chau-Wai Wong; Xinyan Zhao; Xianpeng Liu

arXiv:2101.04615 · cs.CL · April 6, 2021 · 1 citation

TL;DR

This paper introduces a quality-aware crowdsourcing system for semantic content analysis. Timely feedback on workers' performance, quantified by quality scores, helps them maintain annotation quality over extended periods. The system is validated against an expert-labeled dataset and through machine learning tasks.

Contribution

It proposes a novel feedback mechanism to sustain worker quality in subjective annotation tasks, improving large-scale semantic data collection.

Findings

01 Effective in maintaining annotation quality over extended periods

02 Achieves 70%-80% accuracy in machine learning tasks

03 Validated with expert-labeled datasets

Abstract

Many computer scientists use the aggregated answers of online workers to represent ground truth. Prior work has shown that aggregation methods such as majority voting are effective for measuring relatively objective features. For subjective features such as semantic connotation, online workers, known for optimizing their hourly earnings, tend to deteriorate in the quality of their responses as they work longer. In this paper, we aim to address this issue by proposing a quality-aware semantic data annotation system. We observe that with timely feedback on workers' performance quantified by quality scores, better informed online workers can maintain the quality of their labeling throughout an extended period of time. We validate the effectiveness of the proposed annotation system through i) evaluating performance based on an expert-labeled dataset, and ii) demonstrating machine learning…
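
The majority-voting aggregation mentioned in the abstract can be illustrated with a minimal Python sketch. This shows only the baseline aggregation step, not the paper's quality-aware annotation system. The item IDs, label names, and the choice to return no label on a tie are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label, or None if empty or tied for first place."""
    counts = Counter(labels).most_common()
    if not counts:
        return None
    # A tie for first place means there is no majority answer.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


# Hypothetical worker answers per item (not data from the paper).
annotations = {
    "tweet_1": ["fear", "fear", "anger"],
    "tweet_2": ["sadness", "anger"],
}

# Aggregate each item's worker answers into a single label.
aggregated = {item: majority_vote(votes) for item, votes in annotations.items()}
```

Under these assumptions, an item on which workers split evenly gets no aggregated label. That is one simple way to flag the subjective cases for which the abstract suggests plain aggregation is less reliable.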

Code & Models

Models

everyl12/crisis_emotion_roberta (Hugging Face model · 15 downloads · 1 like)

Taxonomy

Topics: Sentiment Analysis and Opinion Mining · Mobile Crowdsensing and Crowdsourcing · Topic Modeling