Crowdsourcing a Word-Emotion Association Lexicon
Saif M. Mohammad, Peter D. Turney

TL;DR
This paper demonstrates how crowdsourcing can efficiently create large, high-quality emotion and polarity lexicons by addressing annotation challenges and optimizing question formulation for better agreement.
Contribution
It introduces a crowdsourcing approach for building emotion lexicons, with strategies for improving annotation quality and for collecting annotations at the sense level rather than the word level.
Findings
Crowdsourcing effectively generates large emotion lexicons.
Question formulation impacts inter-annotator agreement.
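Including a word choice question discourages malicious data entry, flags annotators who may be unfamiliar with the target term, and helps obtain sense-level annotations.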
Abstract
Even though considerable attention has been given to the polarity of words (positive and negative) and the creation of large polarity lexicons, research in emotion analysis has had to rely on limited and small emotion lexicons. In this paper we show how the combined strength and wisdom of the crowds can be used to generate a large, high-quality, word-emotion and word-polarity association lexicon quickly and inexpensively. We enumerate the challenges in emotion annotation in a crowdsourcing scenario and propose solutions to address them. Most notably, in addition to questions about emotions associated with terms, we show how the inclusion of a word choice question can discourage malicious data entry, help identify instances where the annotator may not be familiar with the target term (allowing us to reject such annotations), and help obtain annotations at sense level (rather than at word…
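The filtering role of the word choice question described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' pipeline: the record fields (`term`, `word_choice`, `emotion`), the `gold_choice` answer key, the sample data, and the use of a simple majority vote to aggregate the surviving annotations.

```python
from collections import Counter, defaultdict

# Hypothetical annotation records; the field names and values are
# illustrative assumptions, not data from the paper.
annotations = [
    {"term": "startle", "annotator": "a1", "word_choice": "shock", "emotion": "surprise"},
    {"term": "startle", "annotator": "a2", "word_choice": "shock", "emotion": "surprise"},
    {"term": "startle", "annotator": "a3", "word_choice": "automobile", "emotion": "joy"},
    {"term": "startle", "annotator": "a4", "word_choice": "automobile", "emotion": "joy"},
    {"term": "startle", "annotator": "a5", "word_choice": "automobile", "emotion": "joy"},
]

# Hypothetical answer key: the option closest in meaning to each target term.
gold_choice = {"startle": "shock"}


def filter_by_word_choice(records, answer_key):
    """Keep only records whose word choice answer matches the answer key."""
    return [r for r in records if r["word_choice"] == answer_key.get(r["term"])]


def majority_emotion(records):
    """Return the most frequently chosen emotion for each term.

    Majority voting is an assumed aggregation rule for this sketch.
    """
    votes = defaultdict(Counter)
    for r in records:
        votes[r["term"]][r["emotion"]] += 1
    return {term: counts.most_common(1)[0][0] for term, counts in votes.items()}


kept = filter_by_word_choice(annotations, gold_choice)
unfiltered_label = majority_emotion(annotations)
filtered_label = majority_emotion(kept)
```

In this toy data, three annotators fail the word choice question, so their emotion answers are rejected before aggregation. The aggregated label for the term differs between `unfiltered_label` and `filtered_label`, which is the effect the rejection step is meant to have.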
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Spam and Phishing Detection · Complex Network Analysis Techniques
