Twitter Job/Employment Corpus: A Dataset of Job-Related Discourse Built with Humans in the Loop
Tong Liu, Christopher M. Homan

TL;DR
This paper introduces a new annotated Twitter dataset focused on job-related discourse, created through a human-in-the-loop process, aiming to facilitate research on employment topics in social media.
Contribution
The paper presents a novel, publicly available Twitter corpus annotated with human expertise, enhancing the study of job-related communication on social media.
Findings
Provides a benchmark dataset for job-related topic extraction
Enables advanced analysis of employment discourse on Twitter
Integrates crowdsourcing with expert annotation for high-quality data
Abstract
We present the Twitter Job/Employment Corpus, a collection of tweets annotated by a humans-in-the-loop supervised learning framework that integrates crowdsourcing contributions and expertise on the local community and employment environment. Previous computational studies of job-related phenomena have used corpora collected from workplace social media that are hosted internally by the employers, and so lack independence from latent job-related coercion and the broader context that an open-domain, general-purpose medium such as Twitter provides. Our new corpus promises to be a benchmark for the extraction of job-related topics and for advanced analysis and modeling, and can potentially benefit a wide range of research communities in the future.
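The abstract does not spell out how the humans-in-the-loop framework operates, so the following is only a minimal sketch of one common pattern (uncertainty sampling), not the authors' actual pipeline. It assumes a toy word-count classifier, and every name in it (`train`, `score`, `human_in_the_loop`, `ask_human`) as well as the sample tweets are hypothetical. In each round the current model scores the unlabeled tweets, the ones it is least sure about are sent to a human annotator, and the model is retrained on the enlarged labeled set.

```python
from collections import Counter

def train(labeled):
    """Count word occurrences in job-related (1) and other (0) tweets."""
    counts = {0: Counter(), 1: Counter()}
    for text, label in labeled:
        counts[label].update(text.lower().split())
    return counts

def score(model, text):
    """Return a rough estimate in (0, 1) that a tweet is job-related."""
    pos = neg = 1  # add-one smoothing: a tweet of unseen words scores 0.5
    for word in text.lower().split():
        pos += model[1][word]
        neg += model[0][word]
    return pos / (pos + neg)

def human_in_the_loop(seed, pool, ask_human, rounds=2, batch=2):
    """Alternate between model training and human labeling (illustrative only)."""
    labeled, pool = list(seed), list(pool)
    for _ in range(rounds):
        if not pool:
            break
        model = train(labeled)
        # Tweets whose score is closest to 0.5 are the most uncertain.
        pool.sort(key=lambda t: abs(score(model, t) - 0.5))
        queried, pool = pool[:batch], pool[batch:]
        labeled += [(t, ask_human(t)) for t in queried]
    return train(labeled), labeled

# Hypothetical seed labels and unlabeled pool, invented for this sketch.
seed = [("got a new job today", 1), ("love this pizza", 0)]
pool = [
    "hiring for a new shift",
    "pizza night with friends",
    "my boss is late for work",
    "what a game",
    "late shift at work again",
]

# Stand-in for a crowdworker or local expert; a real system would ask a person.
JOB_WORDS = {"job", "work", "hiring", "shift", "boss"}
def ask_human(text):
    return int(any(w in JOB_WORDS for w in text.lower().split()))

model, labeled = human_in_the_loop(seed, pool, ask_human)
```

The stand-in `ask_human` function is where crowdsourced labels or community expertise would enter in a real system; the keyword rule here exists only so the sketch runs end to end.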
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Sentiment Analysis and Opinion Mining · Complex Network Analysis Techniques
