TL;DR
This paper explores Active Learning strategies for detecting personal employment disclosures on Twitter in multiple languages, showing large gains in precision and recall from only a few labeling iterations.
Contribution
It evaluates three active learning methods for multilingual employment status detection, showing their effectiveness in extreme class imbalance scenarios using BERT models.
Findings
Active learning yields large, significant gains in precision, recall, and diversity of results over a supervised baseline.
Few iterations of active learning yield substantial performance gains.
No single active learning strategy is universally best.
Abstract
Detecting disclosures of individuals' employment status on social media can provide valuable information to match job seekers with suitable vacancies, offer social protection, or measure labor market flows. However, identifying such personal disclosures is a challenging task due to their rarity in a sea of social media content and the variety of linguistic forms used to describe them. Here, we examine three Active Learning (AL) strategies in real-world settings of extreme class imbalance, and identify five types of disclosures about individuals' employment status (e.g. job loss) in three languages using BERT-based classification models. Our findings show that, even under extreme imbalance settings, a small number of AL iterations is sufficient to obtain large and significant gains in precision, recall, and diversity of results compared to a supervised baseline with the same number of…
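The abstract describes running a small number of AL iterations with a classifier under extreme class imbalance. The general shape of such a loop can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' method: it uses uncertainty sampling (a generic AL strategy that may or may not be among the three evaluated), a toy one-dimensional logistic regression in place of a BERT model, and a synthetic pool. All function and variable names are hypothetical.

```python
import math


def sigmoid(z):
    # Clamp the score so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, epochs=300, lr=0.5):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent (toy stand-in for BERT)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict_proba(model, x):
    w, b = model
    return sigmoid(w * x + b)


def active_learning(pool, oracle, seed_indices, iterations=5, batch_size=10):
    """Repeat: train on labeled data, query the most uncertain pool items, label them."""
    labeled = {i: oracle(pool[i]) for i in seed_indices}
    for _ in range(iterations):
        idx = list(labeled)
        model = train_logistic([pool[i] for i in idx], [labeled[i] for i in idx])
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        # Uncertainty sampling: predicted probability closest to 0.5 comes first.
        unlabeled.sort(key=lambda i: abs(predict_proba(model, pool[i]) - 0.5))
        for i in unlabeled[:batch_size]:
            labeled[i] = oracle(pool[i])  # in practice, a human annotator
    # Retrain once more so the returned model has seen every label.
    idx = list(labeled)
    model = train_logistic([pool[i] for i in idx], [labeled[i] for i in idx])
    return model, labeled


# Synthetic, imbalanced pool: 19 of 600 points (about 3%) are positive.
pool = [i / 100 for i in range(-300, 300)]


def oracle(x):
    return 1 if x > 2.8 else 0


seed_indices = [0, 100, 200, 300, 400, 599]  # five negatives and one positive
model, labeled = active_learning(pool, oracle, seed_indices)
print(len(labeled), "labeled items,", sum(labeled.values()), "positives")
```

The design point is that the labeling budget (here `iterations * batch_size`) is spent on items the current model is least sure about, rather than on a random sample that would be dominated by the majority class.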
Code & Models
- 🤗 worldbank/bert-twitter-en-is-hired · model · 7 dl
- 🤗 worldbank/bert-twitter-en-is-unemployed · model · 6 dl
- 🤗 worldbank/bert-twitter-en-job-offer · model · 5 dl
- 🤗 worldbank/bert-twitter-en-lost-job · model · 2 dl · ♡ 1
- 🤗 worldbank/bert-twitter-en-job-search · model · 3 dl
- 🤗 worldbank/bert-twitter-es-is-hired · model · 3 dl
- 🤗 worldbank/bert-twitter-es-lost-job · model · 1 dl
- 🤗 worldbank/bert-twitter-es-is-unemployed · model · 5 dl
- 🤗 worldbank/bert-twitter-es-job-offer · model · 6 dl
- 🤗 worldbank/bert-twitter-es-job-search · model · 13 dl
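The checkpoint names above follow a regular `worldbank/bert-twitter-<language>-<class>` pattern, so a small helper can build them. The sketch below is an illustration only: `model_id` and `load_classifier` are hypothetical names, it covers only the languages and classes visible in this list, and it assumes the checkpoints load through the standard `transformers` text-classification pipeline, which this page does not confirm. The output label names are likewise not documented here.

```python
LANGUAGES = ("en", "es")  # only the languages visible in the list above
CLASSES = ("is-hired", "is-unemployed", "job-offer", "lost-job", "job-search")


def model_id(lang, cls):
    """Build the repository ID for one language/class pair."""
    if lang not in LANGUAGES:
        raise ValueError(f"unknown language: {lang!r}")
    if cls not in CLASSES:
        raise ValueError(f"unknown class: {cls!r}")
    return f"worldbank/bert-twitter-{lang}-{cls}"


def load_classifier(lang, cls):
    """Download and wrap one checkpoint (needs the third-party transformers package)."""
    from transformers import pipeline  # imported lazily so model_id works without it

    return pipeline("text-classification", model=model_id(lang, cls))


print(model_id("en", "lost-job"))
```

Calling `load_classifier("en", "lost-job")` and passing it a tweet string would return a list of label/score dictionaries, assuming the checkpoint is compatible with that pipeline.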
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · WordPiece · Dropout
