Integrating Crowdsourcing and Active Learning for Classification of Work-Life Events from Tweets
Yunpeng Zhao, Mattia Prosperi, Tianchen Lyu, Yi Guo, Jiang Bian

TL;DR
This paper presents a combined crowdsourcing and active learning approach to efficiently annotate tweets for classifying work-life events, reducing manual effort while maintaining high annotation quality.
Contribution
It introduces a novel pipeline integrating crowdsourcing with active learning strategies to improve annotation efficiency for social media data.
Findings
Crowdsourcing yields high-quality annotations for tweets.
Active learning reduces the number of tweets needed for training.
No significant difference was found among the tested active learning strategies.
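To make the active learning finding concrete, here is a minimal sketch of uncertainty sampling, one common active learning strategy. This summary does not say which strategies the authors tested, so the choice of strategy, the function name `uncertainty_sample`, and the scores below are all illustrative assumptions, not the paper's method.

```python
from typing import List, Sequence


def uncertainty_sample(probs: Sequence[float], k: int) -> List[int]:
    """Return indices of the k items whose predicted positive-class
    probability is closest to 0.5, i.e. where the classifier is least sure.

    Hypothetical helper for illustration; not taken from the paper.
    """
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:k]


# Made-up classifier scores for five unlabeled tweets (assumption).
scores = [0.95, 0.52, 0.10, 0.45, 0.70]

# Pick the two tweets to send to annotators next.
to_annotate = uncertainty_sample(scores, k=2)
print(to_annotate)
```

In a full loop, the selected tweets would be labeled, added to the training set, and the classifier retrained before the next round of selection.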
Abstract
Social media, especially Twitter, is increasingly used for research with predictive analytics. In social media studies, natural language processing (NLP) techniques are used in conjunction with expert-based, manual, and qualitative analyses. However, social media data are unstructured and must undergo complex manipulation before they can be used for research. Manual annotation, in which multiple expert raters must reach consensus on every item, is the most resource- and time-consuming step, but it is essential for creating gold-standard datasets for training NLP-based machine learning classifiers. To reduce the burden of manual annotation while maintaining its reliability, we devised a crowdsourcing pipeline combined with active learning strategies. We demonstrated its effectiveness through a case study that identifies job loss events from individual tweets. We used the Amazon Mechanical Turk platform to…
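The abstract's idea of raters reaching consensus on each item can be illustrated with a simple majority vote over crowd labels. The aggregation rule the authors actually used is not given in this excerpt, so the function `majority_label`, the `min_agreement` threshold, the label names, and the votes below are hypothetical and serve only as a sketch.

```python
from collections import Counter
from typing import Dict, List, Optional


def majority_label(votes: List[str], min_agreement: float = 0.6) -> Optional[str]:
    """Return the most common label if its share of the votes reaches
    min_agreement; otherwise return None to flag the item for review.

    Hypothetical aggregation rule; not taken from the paper.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= min_agreement else None


# Made-up crowd votes keyed by tweet ID (assumption).
crowd_votes: Dict[str, List[str]] = {
    "tweet_1": ["job_loss", "job_loss", "other"],
    "tweet_2": ["job_loss", "other"],
}

consensus = {tid: majority_label(v) for tid, v in crowd_votes.items()}
print(consensus)
```

Items that return `None` have no clear majority and could be routed to additional workers or to an expert rater.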
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Spam and Phishing Detection · Machine Learning and Algorithms
